Exploratory Analysis R
Key Concepts:
Terms in this set (23)
function to see the list of levels of a factor column in a df
levels
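A minimal sketch of this card, assuming a small made-up data frame (the name `df` and its `size` column are hypothetical):

```r
# Hypothetical data frame with one factor column
df <- data.frame(size = factor(c("small", "large", "medium", "small")))

# levels() returns the distinct levels of the factor (sorted alphabetically by default)
size_levels <- levels(df$size)
print(size_levels)
```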
Geom to make a bar chart
geom_bar
compare 2 columns in a dataframe
table(1st column, 2nd column)
compare 2 columns as proportions in a df
prop.table(table(1st column, 2nd column))
what do you pass to prop.table to condition the proportions on a row OR a column?
1 for rows (each row sums to 1)
2 for columns (each column sums to 1)
where 1 or 2 is the second argument, e.g. prop.table(tbl, 1)
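The table and prop.table cards above can be sketched together. This assumes the built-in mtcars dataset as a stand-in for your own data frame; the variable names are illustrative only:

```r
# Cross-tabulate two columns of the built-in mtcars data (stand-in for any df)
counts <- table(mtcars$cyl, mtcars$gear)

# No second argument: proportions of the grand total (whole table sums to 1)
overall <- prop.table(counts)

# Second argument 1: condition on rows (each row sums to 1)
by_row <- prop.table(counts, 1)

# Second argument 2: condition on columns (each column sums to 1)
by_col <- prop.table(counts, 2)

print(round(by_row, 2))
```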
what is unique about the aes when setting up geom bar?
you just need an x, not a y variable (geom_bar counts the rows for you)
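As an illustration, a bar chart built with only an x aesthetic. This sketch assumes ggplot2 is installed and uses the built-in mtcars data; `p_bar` is just an illustrative name:

```r
library(ggplot2)

# Only x is mapped; geom_bar() counts the rows in each category itself
p_bar <- ggplot(mtcars, aes(x = factor(cyl))) +
  geom_bar()

# The computed bar heights are the row counts per cylinder group
bar_counts <- layer_data(p_bar)$count
print(bar_counts)
# print(p_bar) would draw the chart
```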
function to create multiple charts in ggplot
facet_wrap
difference between geom bar and geom dotplot?
geom_dotplot draws each individual value as a dot and stacks the dots, where geom_bar draws one solid bar per category
difference between geom histogram and geom bar
geom_bar counts the rows in each category of a discrete variable; geom_histogram groups a continuous variable into bins and counts the rows in each bin.
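A short side-by-side sketch of the two geoms, assuming ggplot2 and the built-in mtcars data. The bin width of 5 is an arbitrary choice for illustration:

```r
library(ggplot2)

# Discrete variable: one bar per category, height = number of rows
p_bar <- ggplot(mtcars, aes(x = factor(gear))) +
  geom_bar()

# Continuous variable: values are grouped into bins 5 mpg wide, then counted
p_hist <- ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(binwidth = 5)

# Every row lands in exactly one bin, so the bin counts add up to nrow(mtcars)
hist_total <- sum(layer_data(p_hist)$count)
print(hist_total)
```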
geom for a density chart and what is it?
geom_density draws a smoothed curve (a kernel density estimate) of a continuous variable, like a smoothed-out histogram
how would you pivot or 'transpose' a chart?
add coord_flip() to the plot
geom to make a boxplot
geom_boxplot
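The coord_flip and geom_boxplot cards combine naturally. A minimal sketch, assuming ggplot2 and the built-in mtcars data:

```r
library(ggplot2)

# Vertical boxplots: one box of mpg values per cylinder group
p_box <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_boxplot()

# Adding coord_flip() swaps the axes, so the boxes run horizontally
p_flipped <- p_box + coord_flip()
# print(p_flipped) would draw the flipped chart
```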
in a density chart how would you set the bandwidth (the smoothing, similar to a histogram's bin width)?
bw=
geom_density(bw = 5)
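To see what bw changes, here is a sketch with two bandwidths on the same data. It assumes ggplot2 and the built-in mtcars data; the values 1 and 5 are arbitrary:

```r
library(ggplot2)

# Small bandwidth: a wigglier curve that follows the data closely
p_narrow <- ggplot(mtcars, aes(x = mpg)) +
  geom_density(bw = 1)

# Large bandwidth: a smoother, flatter curve
p_wide <- ggplot(mtcars, aes(x = mpg)) +
  geom_density(bw = 5)
# print(p_narrow); print(p_wide) would draw both for comparison
```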
what argument would you use to change the labels of a facet grid
labeller
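One possible use of labeller, assuming ggplot2 and the built-in mtcars data. label_both is one of ggplot2's built-in labellers; it shows the variable name next to the value in each strip:

```r
library(ggplot2)

# Strip labels read "cyl: 4", "cyl: 6", "cyl: 8" instead of just "4", "6", "8"
p_labelled <- ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(binwidth = 5) +
  facet_grid(. ~ cyl, labeller = label_both)
# print(p_labelled) would draw the faceted chart
```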
4 terms for the shape of a distribution by its number of peaks (modality)
unimodal, bimodal, multimodal, uniform
Two major types of skew and the rule of thumb to remember them
left skewed and right skewed; the skew is named for the side where the tail is.
If you are mapping multiple variables in geom density what do you have to do to ensure you can see them all
set an alpha (transparency) value
geom_density(alpha =.5)
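A sketch of overlapping densities made visible with alpha, assuming ggplot2 and the built-in mtcars data. Mapping fill to a factor is an illustrative choice:

```r
library(ggplot2)

# One filled density curve per cylinder group; alpha = 0.5 makes each
# fill half transparent so overlapping curves stay visible
p_overlap <- ggplot(mtcars, aes(x = mpg, fill = factor(cyl))) +
  geom_density(alpha = 0.5)
# print(p_overlap) would draw the chart
```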
What does log do in charting and what is its main flaw
log spreads out data that is bunched up at small values so the chart is easier to read. log(0) is -Inf and the log of a negative number is undefined, so when the data contain zeros you need to add a tiny constant first, e.g. aes(x=log(name+.001))
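The zero problem is easy to see in plain R. A small sketch with a made-up vector (the offset .001 follows the card; the right offset depends on the scale of your data):

```r
# Hypothetical values, including a zero
x <- c(0, 1, 10, 100)

# log(0) is -Inf, which cannot be placed on a chart axis
raw_log <- log(x)
print(raw_log)

# Adding a tiny constant first keeps every value finite
shifted_log <- log(x + 0.001)
print(shifted_log)
```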
A factor has some meaningless or small count levels. What do you do to eliminate those levels?
first, filter out the rows containing those levels, then use the droplevels() function to eliminate the levels. They still exist as levels after filtering even though there are no values left in them.
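The two steps can be sketched in base R. The data frame and the "rare" level are made up for illustration, and base subsetting stands in for whatever filtering function you use:

```r
# Hypothetical data with one level that appears only once
df <- data.frame(
  grp = factor(c("a", "a", "b", "b", "rare")),
  val = 1:5
)

# Step 1: filter out the rows containing the unwanted level
kept <- df[df$grp != "rare", ]

# The empty level is still attached to the factor
levels_before <- levels(kept$grp)
print(levels_before)

# Step 2: droplevels() removes levels that no longer have any rows
kept$grp <- droplevels(kept$grp)
levels_after <- levels(kept$grp)
print(levels_after)
```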
When breaking down a chart into several smaller ones what is the function and what does it need as its argument
facet_wrap is the function. It needs the field to break the chart down by, written as a formula with a tilde, which reads as "broken down by": facet_wrap(~name). A formula can also have a variable on each side of the tilde; in facet_grid(rows ~ columns) the left side sets the rows and the right side sets the columns.
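Both formula styles in one sketch, assuming ggplot2 and the built-in mtcars data:

```r
library(ggplot2)

base_plot <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()

# One small chart per value of cyl ("broken down by cyl")
p_wrap <- base_plot + facet_wrap(~ cyl)

# Two-sided formula: am defines the rows, cyl defines the columns
p_grid <- base_plot + facet_grid(am ~ cyl)
# print(p_wrap); print(p_grid) would draw the faceted charts
```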
function that is similar to SQL's DISTINCT
distinct() in dplyr; unique() is the base R equivalent
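A base R sketch with a made-up data frame. The dplyr call appears only as a comment, so the block runs without dplyr installed:

```r
# Hypothetical data with one duplicated row
df <- data.frame(
  city = c("Oslo", "Oslo", "Rome"),
  year = c(2020, 2020, 2021)
)

# Distinct values of a single column
cities <- unique(df$city)
print(cities)

# Distinct rows of the whole data frame
distinct_rows <- unique(df)
print(distinct_rows)

# With dplyr loaded, dplyr::distinct(df) gives the same distinct rows
```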
function to add a title in ggplot
ggtitle
function to get the interquartile range
IQR()
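A quick check of what IQR() returns, using a made-up vector. The result is the distance between the 75th and 25th percentiles:

```r
# Hypothetical values 1 through 9
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9)

# IQR() is the 75th percentile minus the 25th percentile
iqr_val <- IQR(x)

# The same number computed by hand from quantile()
by_hand <- unname(quantile(x, 0.75) - quantile(x, 0.25))

print(iqr_val)
```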