Preface

I don’t know under what context you might be looking at this html, but hello I’m Tim. I’m writing this preface so that no matter where you’re coming from I can explain what you’re about to see.

What is this?

This is a data analysis I completed on survey data I collected from a Facebook group known as Alex G 666posting. Alex G is a prolific indie artist who has been releasing music since at least 2010. His popularity has continued to grow with each album release, but what makes him such an interesting musician is the sheer size of his catalog: counting his leaked unreleased tracks, Alex has 200+ songs.

I joined the group back in January of 2019 and have come to really enjoy the community that exists there. In the group a member, Zion, asked how old people were when they first started listening to Alex G and how old they were now. I asked if it was okay, collected that data, and made a figure for the group showing the distribution of ages in the group. I then asked if it was okay to make a survey for the group to analyze more data, got the okay (specifically from Isabelle), and began working. Many individuals helped with the creation of the survey, as you can see in my thanks at the bottom, and the final result was 38 columns of data from 211 individuals. Some questions are Alex G related, some are not. I hope you find something interesting in this little project I made. I’ve been working on it on and off in my free time since late April, finishing it up in late September. Either way, it’s a labor of love to give back to the great community I found online. The group is full of cool people and I’m glad to be a part of it (I also think this project helped me become a ~~mod~~ admin so that was nice).

Disclaimer

At the time of me writing this, 666posting is on the cusp of reaching 3,000 members. That’s great, but we need to talk about sample size, because it means we only have ~7% of the group accounted for (211 of roughly 3,000). Not only that, but as this is an optional survey, one must be aware of the type of people who would take the time to fill it out. This is not an unbiased slice of the group, but I do like to think it includes the core group of active individuals. So with that in mind, let’s get into the analysis.
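As a quick sanity check on that percentage, here is a minimal sketch using only the two numbers quoted above (211 respondents, roughly 3,000 members). The variable names are mine, not from the original code.

```r
# Numbers quoted in the text; variable names are illustrative
respondents <- 211
group_size  <- 3000   # approximate membership at the time of writing

# Share of the group that answered the survey, in percent
coverage_pct <- 100 * respondents / group_size
round(coverage_pct, 1)
```

That comes out at about 7%, which is a small and self-selected slice of the group.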

Libraries used

tidyverse: Used for general exploratory analysis, primarily used dplyr within it
ztable: Used to make various tables
UpSetR: Used to make Upset plots
wordcloud: Used to make word clouds (surprise!)
rworldmap: The map of the world!
mapproj: The map of the US
wesanderson: Wes Anderson Color Palettes
maps: More maps
viridis: Used for colorblind-friendly palette
rstatix: Stat tests
EnvStats: Used to get the N on ggplots for groupings
ggpubr: Arranging the plots
spotifyr: Getting that spotify data!
scales: Percent axis
reshape2: Data melting
ggforce: Sina Plots

Seed

set.seed(666) This was just to keep the consistency of the word clouds, it used to be 123, but uh, Alex G frequently references 666.
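For anyone wondering what the seed actually does, a minimal sketch: resetting the same seed before a random draw gives the same result every time, which is presumably what keeps a randomly laid out word cloud from shuffling between renders. The `sample()` call is just a stand-in for any random step.

```r
# Same seed, same "random" draw
set.seed(666)
first_draw <- sample(1:100, 5)

set.seed(666)
second_draw <- sample(1:100, 5)

# TRUE: both draws are exactly the same numbers in the same order
identical(first_draw, second_draw)
```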

Demographics Analysis

So the first demographics we’ll take a quick peek at is at age and race.

First, we’ll look at age in a histogram colored by “what generation do you identify with”. Get used to the age variable: it is one of our few numerical values, so I’m going to be plotting it a lot. Anyways, overall age is right-skewed, with only one real outlier at 42. Alex G is about 26-27 at the time of me writing this and has been writing music since his early teenage years, so the demographics of teens and 20-somethings are unsurprising.

As a straight white dude (who we will see is the average Alex G listener), I can’t really speak to why or why not Alex is popular or not popular with the other demographics, and frankly I’m not sure it is my place to speculate. As a result, for race, gender, trans identity, and polyamory I will let you draw your own conclusions. I just don’t feel I as an individual should be speaking didactically about such a subject.

The Original Plot

This is the plot that inspired the project. How? Quick storytime:

Zion, a member of 666posting asked individuals “What age were you when you discovered Alex G and what age are you now?”. Everyone started answering the question and I found it interesting. Eventually, it had 200+ individuals and I came up with an idea of making two overlapping histograms, one distribution for each respective question. I asked for permission and then began the first analysis of 666posting. As you’ll see I didn’t stop there with that data, as I thought of other things to do with it, but this eventually led to me asking if I could run a survey which leads to this whole R markdown. So really Zion inspired me to do this, so if you’re reading this, thanks dude!
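The two-overlapping-histograms idea can be sketched in base R roughly as follows. The ages are made up for illustration and are not the Facebook data, and the original figure was most likely built with ggplot2 rather than base graphics.

```r
# Hypothetical ages, NOT the survey data
age_discovered <- c(15, 16, 17, 17, 18, 19, 19, 20, 21, 23)
age_now        <- c(18, 19, 21, 21, 22, 22, 23, 24, 26, 27)

# Shared breaks so the two distributions line up bar for bar
breaks <- seq(14, 28, by = 2)
h_then <- hist(age_discovered, breaks = breaks, plot = FALSE)
h_now  <- hist(age_now,        breaks = breaks, plot = FALSE)

# Semi-transparent fills let both histograms show through each other
col_then <- rgb(1, 0.5, 0, 0.5)
col_now  <- rgb(0, 0.4, 1, 0.5)
plot(h_then, col = col_then, main = "Age discovered vs. age now",
     xlab = "Age", ylim = c(0, max(h_then$counts, h_now$counts)))
plot(h_now, col = col_now, add = TRUE)
legend("topright", legend = c("Age discovered", "Age now"),
       fill = c(col_then, col_now))
```

Using the same `breaks` for both histograms is the important part; otherwise the bars would not be comparable.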

Another plot I made, using the aforementioned “year learning of Alex G” data I extrapolated is a diagram of his fan base growth over time. I did this by using the “first year people learned of Alex G” data, and overlaying that with his album releases (the albums are colored on the album art). However, unlike my initial making of this plot based on people just commenting on a Facebook post, this data is a little… odd.

Compare to the original figure.

And compared to this figure our new one is definitely different. First of all, we have much more of a normal distribution, besides the person… who first heard Alex G in ’07. Frankly, I’m a little dubious and would be interested in hearing that story. My worries would be assuaged if they lived in PA, but the individual did not identify which state they live in, so alas, I have no idea. The other interesting aspect is that our first non-’07 fan comes in around 2011, after Race, the first complete Alex G album. I would expect a larger boost with Race, and the previous plot I made did have individuals listening around the time of Race, admittedly few, but more than none. It is important to note Alex G was making music with his band the Skin Cells during the early years as well, which could influence when people heard his solo stuff.

Also compared to the previous plot we have a lot more 2018-20 fans, but I think that’s primarily because the original data was pulled from late March/early April 2020, and this current survey has been rolling open the entire year, so that is not unexpected. But on the topic of the actual data, the data is for the most part normal, which is more indicative of the sample we’re pulling from. I assumed the number of fans should just continue increasing with each album as his fame has only grown. Maybe there is a critical limit of indie fans that he might’ve reached? Who knows! I personally believe this is due to the fact that people trickle into these Facebook groups, so while more people might like Alex now, they might not be in the group (or willing to answer a silly survey). It is interesting that 2015 and the release of Beach Music is his largest jump in popularity, as it is also the same year he signed with Domino Records who probably helped promote his work to a wider audience.

How long have we been listening to Alex G?

This long.

It’s the previous plot inverted, what can I say.

Cumulative Distribution Plot of Fan Growth

I just like cumulative distribution plots, it’s all the same data. If you don’t know how cumulative distribution plots work, basically the line shows over the years the growth of Alex’s fanbase as if counting up to the total, so you can see his percentage gain each year to his current 100% of the fanbase.
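If the "counting up to the total" description is hard to picture, here is a minimal sketch with made-up years (not the survey answers). It builds the cumulative percentage by hand and then shows that `ecdf()` gives the same curve.

```r
# Hypothetical "year first heard" answers, NOT the survey data
first_year <- c(2011, 2013, 2014, 2014, 2015, 2015, 2015, 2016, 2018, 2019)

# Count new fans per year, accumulate, and convert to a share of the total
new_per_year   <- table(first_year)
cumulative_pct <- 100 * cumsum(new_per_year) / length(first_year)
cumulative_pct

# ecdf() returns the same curve as a function:
# the share of fans who had already joined by a given year
fan_cdf <- ecdf(first_year)
fan_cdf(2015)
```

In this toy example 7 of the 10 fans had arrived by 2015, so the curve sits at 70% there and reaches 100% in the final year.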

Alex G Fans Across the World

So Alex G, for those not in the know, lives in Philly. But, like most successful artists, people want to see him perform in other cities and he tours pretty regularly. So I thought it would be interesting to see where Alex G fans are all over the world!

Plotting the world

A simple plot of the whole world colored on fans per 100,000 individuals in each of those countries.
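The "fans per 100,000" colouring boils down to one division. A minimal sketch, with invented counts and populations (the country labels and column names are hypothetical):

```r
# Hypothetical counts and populations, for illustration only
countries <- data.frame(
  country    = c("A", "B", "C"),
  fans       = c(150, 3, 1),
  population = c(330e6, 5e6, 0.5e6)
)

# Normalising by population puts small and large countries on the same scale
countries$fans_per_100k <- countries$fans / countries$population * 1e5

# Sort from highest to lowest rate
countries[order(-countries$fans_per_100k), ]
```

Note that country C, with a single fan and a tiny population, ends up with the highest rate. A per-capita scale can be dominated by very small countries in exactly this way.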

HOLD UP, WHERE IS ALL THE RED? You may say, to which I respond, “GIVE ME A SECOND I’LL GET TO IT WE GOTTA ZOOM IN EVERYWHERE FIRST TO FIND IT”

North American Alex G Fans

South American Alex G Fans

South East Asia and Oceania Alex G Fans

Europe Alex G Fans

Malta!

So with all that said, the only interesting question I could think of is comparing this data to Alex G’s touring data so… tada! I took this data from setlist.fm, they actually have interesting concert statistics, I suggest checking them out. I actually had a project idea where I was gonna scrape their data and make an average Alex G setlist for each year and generate a naive Bayes model/shiny app allowing people to put in their perfect setlist and return the probability of that concert happening. But they already did the first thing, and I ended up doing this entire analysis instead of the naive Bayes thing, that’s life. Back to what I was talking about:

Alex G Concerts Around the World from setlist.fm

I put the setlist data into a csv file so I could use it in R, and I decided to plot the number of concerts in a country versus the number of fans. But you can see when I first saw the data….

America ruined the graph. So I decided to log scale it and make it more informative with shapes and colors.

Ireland had 2 concerts and no fans which is why you can’t see it :(
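Why a zero disappears on a log axis, as a minimal sketch with made-up counts. The `+ 1` shift shown here is a common workaround and an assumption on my part, not something the original plot did.

```r
# Hypothetical fan counts; the zero stands in for a country
# with concerts but no survey respondents
fans <- c(150, 12, 3, 1, 0)

# The zero becomes -Inf, so that point cannot be drawn on a log axis
log10(fans)

# A common workaround: shift by one so that zero maps to zero
log10(fans + 1)
```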

Number of concerts per fan linear model with America

## 
## Call:
## lm(formula = concerts ~ fans, data = joined_country_concerts)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -15.1377  -0.0644   1.6368   2.0591   6.7277 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -1.25582    1.42528  -0.881    0.393    
## fans         1.52811    0.04239  36.050 3.29e-15 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 5.297 on 14 degrees of freedom
## Multiple R-squared:  0.9893, Adjusted R-squared:  0.9886 
## F-statistic:  1300 on 1 and 14 DF,  p-value: 3.287e-15

Number of concerts per fan linear model without America

## 
## Call:
## lm(formula = concerts ~ fans, data = no_america_concerts)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -9.8957 -1.0379 -0.0379  2.4785  9.1371 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)   
## (Intercept)   1.0489     1.6272   0.645  0.53041   
## fans          0.9891     0.2444   4.047  0.00138 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.674 on 13 degrees of freedom
## Multiple R-squared:  0.5575, Adjusted R-squared:  0.5235 
## F-statistic: 16.38 on 1 and 13 DF,  p-value: 0.001384

Basically, the fitted number of concerts in a country increases by about 1.5 for each additional fan in said country, but that is driven by the outlier known as the US. If we remove the US, the result changes to an increase of about 0.99 concerts per fan, and the R-squared drops from 0.99 to 0.56.
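To see how a single large point can drive a slope like this, here is a minimal sketch on invented data (eight small countries plus one very large one). The numbers are hypothetical; only the `concerts ~ fans` formula mirrors the models above.

```r
# Hypothetical data: 8 small countries plus one very large outlier
toy <- data.frame(
  fans     = c(1, 2, 2, 3, 4, 5, 6, 8, 150),
  concerts = c(2, 1, 4, 3, 6, 4, 8, 7, 230)
)

# Fit once with every country, once with the outlier removed
fit_all     <- lm(concerts ~ fans, data = toy)
fit_trimmed <- lm(concerts ~ fans, data = toy[toy$fans < 100, ])

# The slope reads as "extra concerts per additional fan"
coef(fit_all)["fans"]
coef(fit_trimmed)["fans"]
```

With the outlier included, the line is essentially drawn from the cluster of small countries to the one big point, so the slope mostly describes that single country.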

Alex G fans in the United States

So now let us look at the Alex G Fans around the US.

I used the below vignette to guide this section https://cran.r-project.org/web/packages/usmap/vignettes/mapping.html

I got the census data from census.gov particularly the file named: NST-EST2019-01: Table 1. Annual Estimates of the Resident Population for the United States...

I’m not gonna lie, I was lazy and cleaned up the data in Excel, but basically I filtered to only the 2019 data. I used this data primarily to control for population in the coloration of the maps below. I don’t really have much to say on them so enjoy!

Plotting all of the US

Plotting the North East

Plotting the South

Plotting the Midwest

Plotting the West

So the analysis I decided to do here was ask “Where is Alex’s fan base the largest (when controlling for population size) in the United States?” I pulled this region data from kaggle and will use it to test our hypothesis. “Our hypothesis?” you say? Yes, obviously it is the East Coast which has the most representation, if it’s not, all hope will be lost.

(P.S. I’ve learned of the state.region data in R, but womp womp here we R, let’s just go with it)

## 
##  4-sample test for equality of proportions without continuity
##  correction
## 
## data:  summarised_region$fans out of summarised_region$pop
## X-squared = 3.4198, df = 3, p-value = 0.3313
## alternative hypothesis: two.sided
## sample estimates:
##       prop 1       prop 2       prop 3       prop 4 
## 4.390522e-07 5.001536e-07 3.264839e-07 4.199483e-07

Well there doesn’t seem to be a clear winner… I’ll just tell myself if we included Southern East Coast states it would’ve won.
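For reference, a test of this shape can be sketched with `prop.test()`. The regional counts and populations below are invented, and the data frame is only a stand-in for the `summarised_region` object named in the output above.

```r
# Hypothetical regional counts, NOT the survey or census figures
region <- data.frame(
  name = c("Northeast", "Midwest", "South", "West"),
  fans = c(25, 30, 40, 33),
  pop  = c(56e6, 68e6, 125e6, 78e6)
)

# Null hypothesis: every region has the same fans-per-resident proportion
result <- prop.test(x = region$fans, n = region$pop, correct = FALSE)
result$p.value

# The per-region proportions being compared
region$fans / region$pop
```

A large p-value here means the data cannot distinguish the regional rates from one another, which is different from showing that they are equal.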

666posting Political Compass

Spice Alert

So one of the first graphs I generated with the survey data, because I thought it would be a fun one to do, is looking at the political leanings in the group. While politics has a lot more depth than can be represented by my figure, I think having two axes, left to right and libertarian to authoritarian, allows a fair amount of depth in itself. The question was phrased as follows:

“Where do you stand on the political compass? This website will calculate it, but you can do it by personal feel as well by looking at the image! https://www.politicalcompass.org/ (center at 5 and round if you get a score from the website, site ~10 minutes)”

With the below image as a guide. The image isn’t completely accurate if you ask me, but from a quick Google search, it does a good enough job.

The reason that I said this is spicy, is because, let’s be honest, we live in an incredibly politically charged time. And when I posted this, I completely understood people not being comfortable with individuals having, uh, “strong fascist leanings” so it caused a bit of a row (Never a good look when your post has more comments than likes). That said, I still think the data is pretty interesting so let’s look at it.

Political plots

As I asked for answers in integers, the data is technically a discrete ordinal value rather than a continuous one, so this is one of the better ways to represent it. The size of the circle represents the number of individuals who selected said option. There will be more readable versions of this data later on in case you’re interested in exact numbers.

Second political compass scatterplot with a jitter and the shape of the points relating to gender. In case you’re wondering why some points are off of the figure that is due to the jitter, I apologize for that.

The same scatterplot as above, but this time the dots are colored on the age of the individuals. I had to remove the individual over 40 because that threw the scale off even when I used a log scale, so my apologies to them.

Let us see if we can break this data up into the different quadrants, including wiggle room in the middle for centrists. Again, it’s a scale of 0-10, so I’m gonna have 4-6 as centrist.
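One way the quadrant split could be coded is sketched below. Several things here are my assumptions rather than the original logic: the function and argument names, that low values mean left and libertarian, that "centrist" requires both axes to fall in 4-6, and that an answer of exactly 5 outside the centre box is pushed to the right or authoritarian side.

```r
# Classify 0-10 compass answers into quadrants, with a centre box for centrists.
# Names and tie-breaking rules are illustrative assumptions.
classify_compass <- function(left_right, lib_auth) {
  econ   <- ifelse(left_right < 5, "Left", "Right")
  social <- ifelse(lib_auth < 5, "Libertarian", "Authoritarian")
  quadrant <- paste(social, econ)

  # Centrist only when BOTH axes land in the 4-6 band
  in_centre <- left_right >= 4 & left_right <= 6 &
               lib_auth   >= 4 & lib_auth   <= 6
  ifelse(in_centre, "Centrist", quadrant)
}

classify_compass(left_right = c(2, 5, 8, 1), lib_auth = c(1, 5, 9, 8))
```

The four example respondents come out as Libertarian Left, Centrist, Authoritarian Right, and Authoritarian Left respectively.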

While this plot is interesting, I don’t think there is a good way to test these groups because some of them have very few individuals, however there is another way we can look at this.

## # A tibble: 2 x 5
##   term        estimate std.error statistic   p.value
##   <chr>          <dbl>     <dbl>     <dbl>     <dbl>
## 1 (Intercept)  23.2        0.414    56.1   6.70e-124
## 2 left_right    0.0375     0.139     0.270 7.88e-  1

Well that’s not significant, how about being libertarian or authoritarian?

## # A tibble: 2 x 5
##   term        estimate std.error statistic   p.value
##   <chr>          <dbl>     <dbl>     <dbl>     <dbl>
## 1 (Intercept)  23.5        0.467    50.3   4.33e-115
## 2 lib_auth     -0.0644     0.124    -0.519 6.04e-  1

Well, I guess I can say that, specifically in this 666posting cohort, your age doesn’t seem to be associated with your political leanings: neither slope is significantly different from zero. Sorry, that wasn’t interesting. Again, some groupings have very few samples, which didn’t help, and I didn’t remove the outliers.

Musicians in 666posting

Another question asked is if you play instruments or not, and if so which ones?

Preparing the data and make a melted version of the data.frame

##  [1] "Guitar"           "Banjo"            "Voice"            "Accordion"       
##  [5] "Bass"             "Violin/Viola"     "Percussion/Drums" "Piano/Keyboard"  
##  [9] "Brass"            "Music Software"   "Cello"            "Mandolin"        
## [13] "Sampler"          "Woodwind"         "Ukulele"          "omnichord"       
## [17] "Chinese Flute"    "clarinet"         "synthesiser"
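For readers of the code-free version, the preparation step amounts to turning multi-select answers into one 0/1 column per instrument. A minimal base R sketch, assuming the raw answers arrive as comma-separated strings (the original used reshape2 for the melting, and the example answers are invented):

```r
# Hypothetical multi-select answers, one string per respondent
answers <- c("Guitar, Bass", "Guitar", "Piano/Keyboard, Voice, Guitar")

# Split each answer into its individual instruments
played      <- strsplit(answers, ",\\s*")
instruments <- sort(unique(unlist(played)))

# One row per respondent, one 0/1 column per instrument
indicator <- t(sapply(played, function(p) as.integer(instruments %in% p)))
colnames(indicator) <- instruments

# How many respondents play each instrument
colSums(indicator)
```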

Generate general summaries to be used in the first plot

Plot of instruments played in 666posting

The case of the Upset Plot vs the Venn Diagram

Okay, here I’m about to go on a visualization tirade. So I’m sure you’re used to Venn diagrams, but have probably not heard of Upset plots. So Venn diagrams are the classic way to represent count crossover between multiple groups, but when groups are large this generally reduces to circles with numbers inside of them. This doesn’t do the scale of the differences proper justice, and people, in general, have a better time understanding the magnitude of difference when there is a visualization. And people are especially good at comparing sizes of adjacent bars. As a result, a new plot has been formed called Upset plots.

I will explain how to read them now. An Upset plot is comprised of two bar plots that relate to a central image of dots and lines. The barplot on the “y-axis” of the center image is the total count of how many people play each instrument. If we look at the center image, you can see each row is labeled by an instrument, while each column has a different set of filled dots. Filled dots in a column mark the instruments in that overlap, and the number of people with exactly that overlap is what is counted in the “x-axis” barplot above. So one plot gives you the total number of people who play a certain instrument, while the other plot gives you how many people share the same overlap of instruments. I hope this makes sense, but it will be readily apparent once you see the plots.

Anyways, the way to read this diagram is as follows: the total count of each instrument is in the bottom left corner, where you can see that we have 121 guitarists who answered the survey, followed by ~70 keyboard/piano players. Then if you look to the right of that graph you’ll see dots and lines. If two dots are filled, the bar above them represents the individuals who play both of those instruments. As more dots are filled, the bar represents an intersection of more instruments. So you can see that there are 27 solo guitarists, while there are 12 people who play all of the instruments in this graph.
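The two sets of bars can be reproduced by hand, which may make the plot easier to read. A minimal sketch on an invented 0/1 table (not the survey data), using base R instead of UpSetR:

```r
# Hypothetical 0/1 instrument table, one row per respondent
plays <- data.frame(
  Guitar = c(1, 1, 1, 0, 1),
  Bass   = c(0, 1, 1, 0, 0),
  Piano  = c(0, 0, 1, 1, 0)
)

# Side bars of an Upset plot: total players per instrument
colSums(plays)

# Top bars: respondents per exact combination of instruments
combo <- apply(plays, 1, function(row) {
  paste(names(plays)[row == 1], collapse = " + ")
})
sort(table(combo), decreasing = TRUE)
```

Here four people play guitar in total, but only two of them are "solo guitarists", which is the same distinction as the 121 versus 27 in the real plot.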

As I said, I couldn’t fit in all the data because of the limitations so sorry to all the unique instrument players, but at least you got a shout out in the first graph you Chinese flute-playing god.

First Upset Plot


Second Instrument plot, it contains more instruments

Fun fact: there are 2 who play the same instruments as Alex, including the guitar, bass, drums, banjo, voice, mandolin, music software, and keyboard (to my knowledge).

How many musicians are in a band?

One last quick question about our instrumentalists, are you in a band?

Frankly, I expected “No” to be the dominant answer, but I am surprised by how good a fight “Yes” put up. Maybe that’s my personal bias of knowing more musicians who aren’t in bands.

Non-musical art

This is basically the same analysis as above, but this time we’re looking at the rest of the world of The Arts!

Plot of art mediums in 666posting

Art Upset Plot

Favorite Music/Musicians

Word Clouds

So… word clouds. Technically speaking, word clouds are never a good way to represent data in a meaningful way. Although the size of the letters increases with the number of votes, one can still be tricked into thinking that, in this case, long band names have more votes, which isn’t true. As a result, you’ll see that I ended up coloring the data a little bit to help clarify this visual discrepancy.

Word cloud For Favorite musician

If they have more than one fan, the name is in orange.
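The colouring rule is simple enough to sketch: count the votes per name, then flag anything with more than one. The artist names are placeholders and the grey fallback colour is my choice, not necessarily what the original used.

```r
# Hypothetical favourite-musician answers with placeholder names
votes <- c("Artist A", "Artist B", "Artist A",
           "A Very Long Band Name", "Artist B", "Artist A")

# Votes per artist
counts <- as.data.frame(table(votes), stringsAsFactors = FALSE)
names(counts) <- c("artist", "n")

# Orange for more than one fan; the grey fallback is an assumption
counts$colour <- ifelse(counts$n > 1, "orange", "grey40")
counts[order(-counts$n), ]
```

The long name takes up a lot of space in a word cloud despite having a single vote, and the colour column is what lets a reader tell that apart.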

Barplot for top 10 musicians except Alex G because he makes the plot look bad

Alex has 80 fans by the way.

Word cloud For Second Favorite musician

Alex has 33 individuals who consider him their second favorite artist.

Favorite Genres

Word cloud For Favorite Genre

Favorite Genre Barplot

How did you find Alex G?

First, we gotta do some MASSIVE data cleaning, I really should not have left this open for people to put whatever they want, look at all the unique comments, as of writing there are 50. So we gotta find a way to generalize these. That said, some are very interesting.

##  [1] "Through a friend"                                                               
##  [2] "Spotify/Apple Suggested Music"                                                  
##  [3] "I know him personally"                                                          
##  [4] "DIY show "                                                                      
##  [5] "Youtube"                                                                        
##  [6] "Press"                                                                          
##  [7] "Tumblr"                                                                         
##  [8] "College Radio"                                                                  
##  [9] "Reddit"                                                                         
## [10] "Compilation Albums"                                                             
## [11] "I read an article from Pitchfork about the release of DSU"                      
## [12] "Suggested by another artist I listen to"                                        
## [13] "Pitchfork"                                                                      
## [14] "SoundCloud"                                                                     
## [15] "Toured Together"                                                                
## [16] "Crush at the time"                                                              
## [17] "another FB group"                                                               
## [18] ""                                                                               
## [19] "4chan"                                                                          
## [20] "Soundcloud"                                                                     
## [21] "saw his name on a festival lineup"                                              
## [22] "Soundcloud recommendation"                                                      
## [23] "Flaked on Netflix"                                                              
## [24] "A girl in twitter who made playlists"                                           
## [25] "Record store clerk"                                                             
## [26] "through music blogs like NME or pitchfork"                                      
## [27] "Radio"                                                                          
## [28] "Nintendo 64 cover on a Ztapes compilation"                                      
## [29] "pitchfork's rocket review, sorry!"                                              
## [30] "8tracks.com"                                                                    
## [31] "Wikipedia"                                                                      
## [32] "Festival Lineup"                                                                
## [33] "Music journalists/festivals"                                                    
## [34] "Saw him perform at a festival"                                                  
## [35] "Music blog"                                                                     
## [36] "Facebook"                                                                       
## [37] "He performed at a festival I attended"                                          
## [38] "lofi record labels (specifically birdtapes and orchid tapes)"                   
## [39] "From a vine"                                                                    
## [40] "Through facebook friends"                                                       
## [41] "tumblr"                                                                         
## [42] "Through my brother (maybe this comes under friend?)"                            
## [43] "420 Love Songs Compilation (wasnt sure if you meant Alex G compilations sorry!)"
## [44] "Through the Spotify playlists of a YouTuber I like"                             
## [45] "i really can’t remember "                                                       
## [46] "radio podcast"                                                                  
## [47] "at a concert (Living Bread, brooklyn, may 2013)"                                
## [48] "was the support at a show i was at "                                            
## [49] "Flake"                                                                          
## [50] "GTA V lol"

Cleaning the data so its presentable

If you’re viewing the version without the code you might not know what is going on here, but basically I am using keywords in all of the non-standard options to fit them into more general categories. For example, if the answer includes a website, I detect a distinctive substring of its name (“umblr” for Tumblr) and then rename that answer to “Other Website”. This is a more interesting section to look at via the coded version of this file.

##  [1] "Through a friend"              "Spotify/Apple Suggested Music"
##  [3] "I know him personally"         "Live Music"                   
##  [5] "Other Website"                 "Journalism"                   
##  [7] "Radio"                         "Compilations/Label"           
##  [9] "Other media"                   "Online Personality"
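The keyword matching described above could look something like this. It is a sketch on a handful of the raw answers, and the patterns and the category each one maps to are my guesses at the approach, not the original rules.

```r
# A few raw answers taken from the list above
raw <- c("tumblr", "Tumblr", "Reddit", "Pitchfork",
         "Saw him perform at a festival", "Through a friend")

# Keyword rules; patterns and target categories are illustrative guesses
clean <- raw
clean[grepl("umblr|eddit", raw)]                          <- "Other Website"
clean[grepl("itchfork", raw)]                             <- "Journalism"
clean[grepl("festival|concert", raw, ignore.case = TRUE)] <- "Live Music"

table(clean)
```

Dropping the first letter of each keyword ("umblr" rather than "Tumblr") is a cheap way to match both capitalisations without a case-insensitive search.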

Making the plot now simplified including a table of how we listen to our music

Alex G Music

Now we’re getting into that good Alex G data, favorite albums, favorite songs, and more!

Favorite Album

Favorite Album Plot

A barplot of favorite released Alex G albums

Favorite Unreleased Album/Compilation

A barplot of favorite Alex G Fan compilations, also for those of you that don’t listen to his unreleased stuff, you should give it a listen.

Overall it seems that people don’t listen to them, but as I said, I suggest it. The winner of them is by a large margin Monsterhead, which makes sense to me. Out of all his compilations, I feel it is the most stacked with well known unreleased tracks (Nintendo 64, Uh, Written in Blood, etc.)

Favorite Music Video

I mean, Gretel’s video is really good

Word Cloud of Favorite Alex G Songs

A word cloud of everyone’s favorite Alex G songs from the survey, colored by what album they’re on, with size based on the number of people that selected a given song as their favorite. So as you can see the top two favorite songs of those who answered the survey are Snot and Gnaw, and tbh, I’m SNOT surprised. Ugh, anyways the thing I actually love about this image is, because of how Alex names his songs, there are cool phrases that are generated. Personally I love “Screwy People, I Wait For You”, it’s just fun sticking his song titles together.

A bar plot of the top 10 favorite Alex G songs, just to ascribe hard numbers to the data, which is the issue with word clouds as said before.

Looks like Gnaw is our winner!

An analysis of favorite song vs favorite album

A quick question I wanted to ask is: “Is it more likely that your favorite song is on your favorite album?” That is what I test below. I made sure to only include favorite songs on released albums, as with unreleased material stuff gets dicey. I ran a chi-square test with the null hypothesis “There is no relationship between one’s favorite song and their favorite album”, meaning that the count should be split evenly between “Yes, my favorite song is on my favorite album” and “No, my favorite song is not on my favorite album”, and it turns out….

## 
## FALSE  TRUE 
##   112    64

## 
##  Chi-squared test for given probabilities
## 
## data:  table(songs_dat_test$fav_song_fav_album)
## X-squared = 13.091, df = 1, p-value = 0.0002967

It is more likely that you will not have your favorite song in your favorite album, statistically speaking. Frankly if you compare favorite songs to favorite albums I’m just gonna blame Gnaw for this one.
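The test above can be re-run from just the two counts in the table. A minimal sketch; the vector and its names are mine, while the counts (112 and 64) come from the output above.

```r
# Counts reported in the table above: FALSE = 112, TRUE = 64
on_fav_album <- c(not_on_album = 112, on_album = 64)

# With no `p` argument, chisq.test() assumes equal expected
# proportions, here 88 vs 88 out of 176
test <- chisq.test(on_fav_album)
test$statistic
test$p.value
```

This reproduces the X-squared of 13.091 on 1 degree of freedom. Keep in mind that the 50/50 split is what the test is measuring against, so "significant" here means "not an even split".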

The spotifyr package

I would first like to thank a friend of mine, Stephanie Y., who looked over this markdown for me and suggested this package.

You might be wondering, “what is spotifyr?”, and to put it simply, it is an R package that allows me to effectively use Spotify’s API to get data about Alex G and his music. While I could do a lot with this data, what I’m most interested in is the different “qualities” that Spotify attributes to various songs. This is done using machine-learning on the songs to determine their “valence” (sad to happy; 0 to 1), danceability (0 to 1), energy (0 to 1), etc. I’m going to make tables showing the overall top and bottom songs for each of these categories, and honestly, I’m dubious as hell about these. For example, they have a quality known as “liveness”, which is meant to detect whether the song was recorded live or not. The number 1 “liveness” song is Brick; meanwhile Sugar House - LIVE is third! We also have loudness in decibels, which goes from Clouds (thanks Luke) to Brick. But if you want to know more click here, it’s interesting anyways.

note: All scales are from 0-1 except loudness which is decibels.

note 2: ~~PUT RACE ON SPOTIFY PLEASE~~

Generating tables with the top 5 and bottom 5 tracks for a select handful of Spotify qualities.
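Picking the top 5 and bottom 5 for one quality is just a sort followed by `head()` and `tail()`. A minimal sketch on invented values; in the real analysis the data frame would presumably come from spotifyr, which needs API credentials, so a toy table stands in here.

```r
# Hypothetical track features, NOT real Spotify values
tracks <- data.frame(
  track_name = paste("Track", 1:12),
  valence    = c(0.91, 0.12, 0.55, 0.33, 0.78, 0.05,
                 0.61, 0.47, 0.84, 0.29, 0.70, 0.18)
)

# Sort from happiest to saddest, then take both ends
ranked   <- tracks[order(tracks$valence, decreasing = TRUE), ]
top_5    <- head(ranked, 5)
bottom_5 <- tail(ranked, 5)

top_5
bottom_5
```

The same three lines can be repeated for danceability, energy, or loudness by swapping the column name.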

Now that you’ve probably drawn your own opinions about the validity of this data (reminder: these are the top 5 and bottom 5 songs per song quality) let’s look over Alex’s albums and then we’ll see how this intersects with our music.

Alex G albums as an emotional rollercoaster

Technically not the best way to represent the data but…..

Hearing loss due to Alex G albums (loudness)

SHUT UP SHUT UP

note: Remember, the value is in decibels, the closer to 0 the louder it is

I find the overall increase of volume over his career interesting for two reasons. First, Alex mixed and mastered his songs until Beach Music, which was engineered by someone at Domino Records, and the Domino albums seem to have a more consistent volume level. Second, it is known that over time music has only gotten louder, and it seems to be true for Alex as well. Actually, let us test this real quick.

Has Alex gotten louder over time?

## 
## Call:
## lm(formula = loudness ~ album_release_year, data = alex_g)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -11.057  -1.770   0.153   2.055   5.287 
## 
## Coefficients:
##                     Estimate Std. Error t value Pr(>|t|)   
## (Intercept)        -640.2495   215.2093  -2.975  0.00373 **
## album_release_year    0.3136     0.1069   2.935  0.00421 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.997 on 93 degrees of freedom
## Multiple R-squared:  0.08476,    Adjusted R-squared:  0.07492 
## F-statistic: 8.613 on 1 and 93 DF,  p-value: 0.004205

So, to an extent, yes the theory is correct, his music seems to get louder as time passes, and at least Rules and Trick had a relatively larger distribution of volume levels compared to his later work.

Energy

Danceability

Do Alex G members have a musical preference?

Is there a secret formula to Alex G’s top hits? Maybe we can use this data to figure it out! By intersecting the different Spotify qualities with 666posting members’ favorite songs, we can see if the distribution of favorites is different from that of Alex’s discography. Remember though, this excludes tracks that were not officially released.

## 
##  Two-sample Kolmogorov-Smirnov test
## 
## data:  filtered_songs$danceability and filtered_alex_g$danceability
## D = 0.15808, p-value = 0.09332
## alternative hypothesis: two-sided

## 
##  Two-sample Kolmogorov-Smirnov test
## 
## data:  filtered_songs$energy and filtered_alex_g$energy
## D = 0.077578, p-value = 0.8542
## alternative hypothesis: two-sided

## 
##  Two-sample Kolmogorov-Smirnov test
## 
## data:  filtered_songs$valence and filtered_alex_g$valence
## D = 0.085427, p-value = 0.7621
## alternative hypothesis: two-sided

Well, these are interesting distributions. Alex has a somewhat normal curve for danceability, but there seems to be a bimodal distribution amongst fan-favorite songs. As a result, fans do lean more towards danceable songs. With energy, people prefer lower energy, with fan preference taking a dip around 0.8. Lastly, with valence, which I will remind you is basically happiness, there is a low peak around .30 of sad peeps with some happy nerds around .75, breaking through the discography distribution. Statistically speaking, the Kolmogorov-Smirnov tests, which check whether two samples come from the same distribution, show no significant differences, but that bimodal distribution from danceability trends towards significance.
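To make the test a bit less of a black box, here is a minimal sketch of a two-sample Kolmogorov-Smirnov test in base R. Both samples are simulated (a roughly normal "discography" and a bimodal "favorites" set); they are stand-ins, not the real `filtered_songs` or `filtered_alex_g` data. The D statistic is simply the largest vertical gap between the two empirical cumulative distribution curves, which the last lines recompute by hand.

```r
set.seed(42)

# Simulated stand-ins for a feature such as danceability.
discography <- rnorm(95, mean = 0.50, sd = 0.12)                # roughly normal
favorites   <- c(rnorm(30, 0.40, 0.05), rnorm(30, 0.65, 0.05))  # two bumps

ks <- ks.test(favorites, discography)
print(ks$statistic)
print(ks$p.value)

# D by hand: maximum gap between the two empirical CDFs over all observed values.
grid <- sort(c(favorites, discography))
d_manual <- max(abs(ecdf(favorites)(grid) - ecdf(discography)(grid)))
print(d_manual)
```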

Personality Questions

Myers Briggs

Myers-Briggs is the zeitgeist when it comes to personality tests. For more information on them check out this website: https://www.16personalities.com/free-personality-test . While there are sixteen personalities, they are not evenly distributed in the population: according to data from the official MBTI, some types make up over 10% of the population, others closer to 2%. These types can also be broken down further into their more discrete types based on the letters, Extrovert vs Introvert (E v I) for example. I have two figures below showing both of these distributions in the general population. We will then compare this to the Alex G data.
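Breaking a four-letter type into its single letters is a small string operation. The sketch below uses six made-up responses and hypothetical column names (`EI`, `SN`, `TF`, `JP`); it only illustrates the idea and is not the preparation code used for the figures.

```r
# Six invented responses, one four-letter type each.
mbti <- c("INFP", "INFJ", "ENFP", "INTP", "INFP", "ISFP")

# Split every type into its four letters: one row per person, one column per axis.
letters_mat <- do.call(rbind, strsplit(mbti, ""))
colnames(letters_mat) <- c("EI", "SN", "TF", "JP")

# Count the letters on each axis.
letter_counts <- lapply(as.data.frame(letters_mat, stringsAsFactors = FALSE), table)
print(letter_counts$EI)
print(letter_counts$SN)
```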

General Population

Not really sure what the best question to ask here is besides the distribution and how the distribution differs from the normal world. Data pulled from here

General population plots of Myers-Briggs types

General population letter types

Now the 666posting population

Run the Chi-square analysis against general population probabilities

Plotting MBTI Types

Plotting Single MBTI Letter Types

MBTI Analysis

The group, compared to the general population, well, they look NOTHING alike. While the population data is relatively outdated (cough cough, it hasn’t been updated since 2002), our distribution is not close to it at all; below I will have the outputs side by side. A good portion of our members come from the rarer types, and specifically, 18.23% of participating members are the rarest type, INFJ, which makes up ~2% of the normal population. Crazy. This, of course, carries over to the discrete types as well. The starkest difference is that the Intuitives considerably outnumber the Sensors in this group (I don’t know what that means, but if you want to explain it in the comments feel free to), which is the opposite of the general population. It’s wacky, folks.

Note: The reason the p-values are the same is that I have the chi-square set to simulate.p.value, as the distribution of values otherwise results in a warning. This isn’t the best dataset for this test.
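A likely explanation for the identical p-values, sketched with invented counts: when `chisq.test()` simulates the p-value with B replicates, the smallest value it can ever report is 1 / (B + 1), so every strongly non-matching distribution lands on the same floor. The counts and probabilities below are toy values and B = 2000 is R's default, assumed here; this is not the original call.

```r
set.seed(666)

# Toy counts and reference probabilities (invented for illustration).
observed   <- c(a = 30, b = 25, c = 3, d = 2)
expected_p <- c(a = 0.05, b = 0.05, c = 0.45, d = 0.45)

# The asymptotic test warns because some expected counts fall below 5.
asymptotic <- suppressWarnings(chisq.test(observed, p = expected_p))

# Monte Carlo version: no warning, but the p-value has a floor of 1 / (B + 1).
simulated <- chisq.test(observed, p = expected_p, simulate.p.value = TRUE, B = 2000)
floor_p   <- 1 / (2000 + 1)

print(asymptotic$p.value)
print(simulated$p.value)
print(floor_p)
```

Raising `B` lowers that floor, at the cost of a longer run.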

Birthday/Zodiacs

Gonna be honest, I am underqualified to say anything about what zodiac signs mean, so I’m going to leave these results up to your interpretation. That said, one thing I do know is that these signs break up into the four classic elements (fire, water, air, and earth), which are included in my analysis. To generate the population data I pulled CDC birth records from 2000-2014 as an estimate. There appears to be a relatively uniform distribution of signs, and similarly of elements, in the general population.
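Turning a birthday into a sign and then an element can be sketched as below. The start dates are one common convention (sources differ by a day here and there), so treat them as an assumption; `zodiac_sign()` and `element_of` are hypothetical names, not the ones used in this analysis.

```r
# Hypothetical helper: map month/day to a zodiac sign (vectorised).
zodiac_sign <- function(month, day) {
  # Each row is the date on which a sign begins; Capricorn wraps around the new year.
  starts <- data.frame(
    sign  = c("Capricorn", "Aquarius", "Pisces", "Aries", "Taurus", "Gemini", "Cancer",
              "Leo", "Virgo", "Libra", "Scorpio", "Sagittarius", "Capricorn"),
    month = c(1, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12),
    day   = c(1, 20, 19, 21, 20, 21, 21, 23, 23, 23, 23, 22, 22),
    stringsAsFactors = FALSE
  )
  key       <- month * 100 + day            # e.g. June 15 becomes 615
  start_key <- starts$month * 100 + starts$day
  starts$sign[findInterval(key, start_key)] # last start date not after the birthday
}

# The four classic elements, three signs each.
element_of <- c(Aries = "Fire", Leo = "Fire", Sagittarius = "Fire",
                Taurus = "Earth", Virgo = "Earth", Capricorn = "Earth",
                Gemini = "Air", Libra = "Air", Aquarius = "Air",
                Cancer = "Water", Scorpio = "Water", Pisces = "Water")

signs    <- zodiac_sign(month = c(6, 12, 1, 3), day = c(15, 25, 20, 20))
elements <- unname(element_of[signs])
print(data.frame(sign = signs, element = elements))
```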

General Population

Pulling public data to make a fair general population example and then preparing it

Making both colors and general population plots for zodiac signs

Also zodiac elements

Preparing the Alex G data for the same analysis done above

Basic Alex G zodiac sign plot

Plotting the zodiac elements of the Alex G data

Astrology Analysis

In our group, however, this aforementioned balanced distribution doesn’t hold true. Capricorns make up around 3% of our group while Geminis make up about 11.4%. That said, this is not a significant change in the distribution according to a chi-square analysis. The same is true for elements: the change nears significance but does not cross the arbitrary (thanks Fisher) 0.05 threshold, though overall there are fewer Earth and Water signs in this group.

Hogwarts House

General Population

I’m also not great with Harry Potter info, but this was relatively interesting because there is some insight to be gained. I had to eyeball the general population percentages from the article below because they didn’t actually give the true numbers. What’s interesting is that the main difference between the general population and 666posting is that we have more Slytherins. What makes this even more interesting is that, according to the Time article, Slytherin makes up a large proportion of the younger population, and I’d say that this group leans to the younger side (see the ages above). And so, I’d argue this cohort doesn’t stray from the normal population distribution when adjusted for age. But we don’t have enough older people to properly adjust for age, so just take my word for it, K? Great!

~~Data taken from https://drive.google.com/file/d/0B8PCmhQmtcDKLXlzSGtnZ0hKbjQ/view~~ Data taken from Time Magazine, they don’t have the actual percents posted, but a much larger sample size.

Re-plot the Times Data

Preparing and cleaning Alex G data

Now that we have the general population, prepare the 666posting data

Chi-Square for houses

## 
##  Chi-squared test for given probabilities
## 
## data:  house_dat$count
## X-squared = 26.732, df = 3, p-value = 6.699e-06

Plotting Alex G Hogwarts Houses

As I said above, the only thing of note is the change in Slytherin, which I attribute to the trend that Slytherins are generally younger, and this group is a younger sample compared to the overall population.

Are there correlations between types?

My last question, and the one I found most interesting, is: are personality types related to one another? E.g. are certain MBTI types enriched in a given zodiac sign? Long story short, kinda, kinda not. I made a bunch of heatmap tables using an R package called ztable and calculated Chi-squares, and while there were a few hits, I assure you none would stand up to any multiple testing correction, so I don’t have any larger statement to make here. I feel like we are underpowered in participants to test this hypothesis, as not everyone answered all three survey questions, and even if they had we still might need a larger N.

note: I’m going to not echo these as the code is repetitive.

Using Ztables from this link

Ztable signs vs MBTI

Heatmap Table of Zodiac Signs and Myers-Briggs Types; Chi-Square p-value of 0.089
	Aquarius	Aries	Cancer	Capricorn	Gemini	Leo	Libra	Pisces	Sagittarius	Scorpio	Taurus	Virgo
ENFJ	2	2	1	0	0	0	2	0	2	1	0	1
ENFP	2	5	5	1	4	3	3	2	4	1	2	2
ENTJ	0	0	0	1	0	0	0	0	0	1	0	0
ENTP	0	0	2	0	0	1	1	0	0	1	0	1
ESFP	0	0	0	0	1	1	0	0	0	0	1	0
INFJ	1	1	2	0	4	5	2	2	6	4	2	4
INFP	10	6	1	4	1	5	5	4	7	0	4	4
INTJ	0	0	0	1	1	0	3	3	2	1	0	3
INTP	1	0	0	2	7	1	1	3	0	1	1	3
ISFJ	0	0	0	0	0	1	0	0	0	0	0	0
ISFP	0	0	0	0	1	0	0	0	0	1	1	0
ISTP	0	0	0	0	1	0	1	0	0	0	0	0

Ztable Zodiac Elements vs MBTI

Heatmap Table of Zodiac Elements and Myers-Briggs Types; Chi-Square p-value of 0.546
	ENFJ	ENFP	ENTJ	ENTP	ESFP	INFJ	INFP	INTJ	INTP	ISFJ	ISFP	ISTP
Air	4	9	0	1	1	7	16	4	9	0	1	2
Earth	1	5	1	1	1	6	12	4	6	0	1	0
Fire	4	12	0	1	1	12	18	2	1	1	0	0
Water	2	8	1	3	0	8	5	4	4	0	1	0

Ztable for Zodiac Signs and HP Houses

Heatmap Table of Zodiac Signs and Hogwarts Houses; Chi-Square p-value of 0.968
	Aquarius	Aries	Cancer	Capricorn	Gemini	Leo	Libra	Pisces	Sagittarius	Scorpio	Taurus	Virgo
Gryffindor	2	1	1	1	2	2	2	1	4	0	1	2
Hufflepuff	4	3	1	0	3	5	2	2	7	3	4	4
Ravenclaw	3	4	1	1	9	4	5	2	4	2	3	4
Slytherin	4	0	2	1	2	2	3	0	3	4	1	3

Ztable for Zodiac Elements and HP Houses

Heatmap Table of Zodiac Elements and Hogwarts Houses; Chi-Square p-value of 0.774
	Air	Earth	Fire	Water
Gryffindor	6	4	7	2
Hufflepuff	9	8	15	6
Ravenclaw	17	8	12	5
Slytherin	9	5	5	6

MBTI vs Houses

Heatmap Table of Hogwarts Houses and Myers-Briggs Types; Chi-Square p-value of 0.368
	Gryffindor	Hufflepuff	Ravenclaw	Slytherin
ENFJ	3	1	3	1
ENFP	6	6	8	7
ENTJ	0	0	0	1
ENTP	0	0	2	0
INFJ	2	11	7	4
INFP	3	13	9	6
INTJ	3	1	3	1
INTP	1	1	4	4
ISFJ	0	1	0	0
ISFP	0	1	0	0
ISTP	0	1	0	0

MBTI letters vs Zodiac Signs

Prepare data

Sign and IE

Heatmap Table of Zodiac Signs and Intro vs Extra; Chi-Square p-value of 0.182
	Aquarius	Aries	Cancer	Capricorn	Gemini	Leo	Libra	Pisces	Sagittarius	Scorpio	Taurus	Virgo
E	4	7	8	2	5	5	6	2	6	4	3	4
I	12	7	3	7	15	12	12	12	15	7	8	14

Sign and SN

Heatmap Table of Zodiac Signs and Intuition vs Sensing; Chi-Square p-value of 0.128
	Aquarius	Aries	Cancer	Capricorn	Gemini	Leo	Libra	Pisces	Sagittarius	Scorpio	Taurus	Virgo
N	16	14	11	9	17	15	17	14	21	10	9	18
S	0	0	0	0	3	2	1	0	0	1	2	0

Sign and FT

Heatmap Table of Zodiac Signs and Feeling vs Thinking; Chi-Square p-value of 0.0045
	Aquarius	Aries	Cancer	Capricorn	Gemini	Leo	Libra	Pisces	Sagittarius	Scorpio	Taurus	Virgo
F	15	14	9	5	11	15	12	8	19	7	10	11
T	1	0	2	4	9	2	6	6	2	4	1	7

Sign and JP

Heatmap Table of Zodiac Signs and Judge vs Perceive; Chi-Square p-value of 0.325
	Aquarius	Aries	Cancer	Capricorn	Gemini	Leo	Libra	Pisces	Sagittarius	Scorpio	Taurus	Virgo
J	3	3	3	2	5	6	7	5	10	7	2	8
P	13	11	8	7	15	11	11	9	11	4	9	10

MBTI letters vs Zodiac Elements

Element EI

Heatmap Table of Zodiac Elements and Intro vs Extra; Chi-Square p-value of 0.469
	Air	Earth	Fire	Water
E	15	9	18	14
I	39	29	34	22

Element SN

Heatmap Table of Zodiac Elements and Intuition vs Sensing; Chi-Square p-value of 0.849
	Air	Earth	Fire	Water
N	50	36	50	35
S	4	2	2	1

Element FT

Heatmap Table of Zodiac Elements and Feeling vs Thinking; Chi-Square p-value of 0.0115
	Air	Earth	Fire	Water
F	38	26	48	24
T	16	12	4	12
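As a worked example of what sits behind one of these p-values, the counts from the Element FT table directly above can be fed to base R's `chisq.test()`. This is a sketch, not the original call (the tables in this report were produced through the ztable workflow, whose exact settings are not shown), so the p-value should land in the same neighbourhood as the reported one without necessarily matching it. The standardized residuals show which cells are pulling the result.

```r
# Counts copied from the Zodiac Elements vs F/T table above.
element_ft <- matrix(
  c(38, 26, 48, 24,
    16, 12,  4, 12),
  nrow = 2, byrow = TRUE,
  dimnames = list(letter  = c("F", "T"),
                  element = c("Air", "Earth", "Fire", "Water"))
)

ft_test <- chisq.test(element_ft)   # test of independence, df = (2 - 1) * (4 - 1)
print(ft_test)

# Expected counts under independence, and standardized residuals per cell.
print(round(ft_test$expected, 1))
print(round(ft_test$stdres, 2))
```

The large negative residual for Thinking types among Fire signs is the cell doing most of the work here.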

Element JP

Heatmap Table of Zodiac Elements and Judge vs Perceive; Chi-Square p-value of 0.536
	Air	Earth	Fire	Water
J	15	12	19	15
P	39	26	33	21

Hogwarts MBTI Letter Analysis

Prepare data

House IE

Heatmap Table of Hogwarts Houses and Intro vs Extra; Chi-Square p-value of 0.123
	Gryffindor	Hufflepuff	Ravenclaw	Slytherin
E	9	7	13	9
I	9	29	23	15

House SN

Heatmap Table of Hogwarts Houses and Intuition vs Sensing; Chi-Square p-value of 0.141
	Gryffindor	Hufflepuff	Ravenclaw	Slytherin
N	18	33	36	24
S	0	3	0	0

House FT

Heatmap Table of Hogwarts Houses and Feeling vs Thinking; Chi-Square p-value of 0.234
	Gryffindor	Hufflepuff	Ravenclaw	Slytherin
F	14	33	27	18
T	4	3	9	6

House JP

Heatmap Table of Hogwarts Houses and Judge vs Perceive; Chi-Square p-value of 0.779
	Gryffindor	Hufflepuff	Ravenclaw	Slytherin
J	8	14	13	7
P	10	22	23	17
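The multiple testing point can be checked directly with the seventeen p-values printed in the heatmap table captions of this section. The sketch below applies two standard corrections via base R's `p.adjust()`; the vector names are just labels made up for readability.

```r
# p-values copied from the heatmap table captions in this section.
p_values <- c(
  sign_mbti = 0.089, element_mbti = 0.546, sign_house = 0.968,
  element_house = 0.774, mbti_house = 0.368,
  sign_IE = 0.182, sign_SN = 0.128, sign_FT = 0.0045, sign_JP = 0.325,
  element_IE = 0.469, element_SN = 0.849, element_FT = 0.0115, element_JP = 0.536,
  house_IE = 0.123, house_SN = 0.141, house_FT = 0.234, house_JP = 0.779
)

bonferroni <- p.adjust(p_values, method = "bonferroni")  # multiply by number of tests
bh         <- p.adjust(p_values, method = "BH")          # Benjamini-Hochberg FDR

print(round(sort(bonferroni)[1:3], 4))
print(round(sort(bh)[1:3], 4))
print(any(bonferroni < 0.05) || any(bh < 0.05))
```

Even the strongest hit (zodiac sign vs Feeling/Thinking, 0.0045) becomes 0.0045 × 17 = 0.0765 after Bonferroni, so nothing clears 0.05 under either correction.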

Habits and Entertainment

Substance Usage

Let us first make some barplots of both drugs followed by a z-table to see what our distribution of recorded substance usage is. I will also run a Chi-square (because the function already does that) to see if the distribution of usage is unexpected, aka, are weed use and alcohol use random, or are they correlated in some manner. I am also ordering factors to make the tables and following graphs more readable.

Bringing in data about weed use around the world and the states

Alcohol and weed barplots

Now let’s look at that Z table

Heatmap Table of Alcohol and Weed Usage; Chi-Square p-value of 0.021
	I have never smoked weed	I don't smoke weed anymore	Once a month	Once a week	Multiple times per week	Every day	Multiple times a day
I never had alcohol	5	2	0	0	1	0	1
I don’t drink anymore	1	6	1	0	2	1	4
Once a month/socially	10	27	12	1	7	8	10
Once a week	2	16	9	4	5	3	3
Multiple times per week	1	25	12	2	11	2	5
Every day	0	6	0	0	1	1	1
Multiple times a day	0	0	1	0	0	0	1

So, according to the Chi-Square, the p-value (0.021) does dip under 0.05, but with this many near-empty cells I don’t think there is anything to be concluded here. Overall, the most common answers are that people used to smoke weed but don’t anymore, and that people drink socially.

Now let us see if there are any interesting distributions with age

Above is an interesting way to view the distribution, but to see differences there are better plots such as the box-plot below:

These distributions don’t seem that different from one another. As a test, I’ll run a linear model to see if these drinking groups are a good predictor of age. That said, I’m not really going to check any assumptions (don’t do this), I will remove the one older individual because they are an outlier, but I don’t expect there to be a difference.

## # A tibble: 7 x 5
##   alc_use                 variable     n  mean    sd
##   <fct>                   <chr>    <dbl> <dbl> <dbl>
## 1 I never had alcohol     age          9  21.8 5.12 
## 2 I don't drink anymore   age         15  23.7 4.17 
## 3 Once a month/socially   age         75  22.3 2.66 
## 4 Once a week             age         42  23.5 2.79 
## 5 Multiple times per week age         57  24.3 3.51 
## 6 Every day               age          9  24.7 1.94 
## 7 Multiple times a day    age          2  26.5 0.707

## 
## Call:
## lm(formula = center_age ~ alc_use, data = no_out_drugs)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -6.2667 -2.2667 -0.2667  2.3333 10.2667 
## 
## Coefficients:
##                                Estimate Std. Error t value Pr(>|t|)  
## (Intercept)                     -1.5045     1.0528  -1.429   0.1545  
## alc_useI don't drink anymore     1.9556     1.3317   1.468   0.1435  
## alc_useOnce a month/socially     0.4889     1.1142   0.439   0.6613  
## alc_useOnce a week               1.6984     1.1602   1.464   0.1448  
## alc_useMultiple times per week   2.4854     1.1329   2.194   0.0294 *
## alc_useEvery day                 2.8889     1.4889   1.940   0.0537 .
## alc_useMultiple times a day      4.7222     2.4691   1.913   0.0572 .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.159 on 202 degrees of freedom
## Multiple R-squared:  0.08829,    Adjusted R-squared:  0.06121 
## F-statistic:  3.26 on 6 and 202 DF,  p-value: 0.004415

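The overall model is significant (F-test p = 0.004), and the “multiple times per week” group comes out nominally older than the “I never had alcohol” reference group (p = 0.029), but nothing here is corrected for multiple comparisons, so I wouldn’t read much into it. Social drinking does trend towards younger individuals, which is interesting.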
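A quick note on reading that coefficient table, with a sketch on invented data. With a categorical predictor, `lm()` uses the first factor level as the reference: the intercept is that group's mean, and every other coefficient is the difference between a group's mean and the reference group's mean. The toy groups below are made up, and defining `center_age` as age minus the overall mean is an assumption about how the original variable was built.

```r
set.seed(7)

# Invented data: three drinking groups with slightly different mean ages.
toy <- data.frame(
  alc_use = factor(rep(c("never", "socially", "weekly"), times = c(9, 30, 20)),
                   levels = c("never", "socially", "weekly")),
  age = c(rnorm(9, 22, 3), rnorm(30, 22.5, 3), rnorm(20, 24, 3))
)
toy$center_age <- toy$age - mean(toy$age)   # assumed definition of centered age

fit         <- lm(center_age ~ alc_use, data = toy)
group_means <- tapply(toy$center_age, toy$alc_use, mean)

# Intercept = reference group mean; other coefficients = differences from it.
print(coef(fit))
print(group_means - group_means[["never"]])
```

The same reading applies to the output above: the “Multiple times per week” estimate of 2.4854 is just that group's mean age (24.3) minus the never-drinkers' mean age (21.8), up to rounding.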

And now I’m going to reveal all the filthy filthy lawbreakers in this group. I’m talking under-age drinking. (I’m being sarcastic, but I think it’s an interesting question.)

Disgusting…

Weed usage plots

Alright on to the completely legal topic of weed usage

Again the better way to view the data

Let us test if age differs significantly between weed use groups.

## # A tibble: 7 x 5
##   weed_use                   variable     n  mean    sd
##   <fct>                      <chr>    <dbl> <dbl> <dbl>
## 1 I have never smoked weed   age         19  22    3.40
## 2 I don't smoke weed anymore age         82  23.4  3.46
## 3 Once a month               age         35  23.4  3.09
## 4 Once a week                age          7  25.1  2.91
## 5 Multiple times per week    age         26  22.4  2.37
## 6 Every day                  age         15  21.9  2.15
## 7 Multiple times a day       age         25  25.0  3.35

## 
## Call:
## lm(formula = center_age ~ weed_use, data = no_out_drugs)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -7.3537 -2.3537 -0.4231  1.9600  9.6463 
## 
## Coefficients:
##                                    Estimate Std. Error t value Pr(>|t|)   
## (Intercept)                        -1.28230    0.72731  -1.763  0.07940 . 
## weed_useI don't smoke weed anymore  1.35366    0.80719   1.677  0.09509 . 
## weed_useOnce a month                1.40000    0.90341   1.550  0.12278   
## weed_useOnce a week                 3.14286    1.40171   2.242  0.02604 * 
## weed_useMultiple times per week     0.42308    0.95684   0.442  0.65885   
## weed_useEvery day                  -0.06667    1.09500  -0.061  0.95151   
## weed_useMultiple times a day        3.04000    0.96489   3.151  0.00188 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.17 on 202 degrees of freedom
## Multiple R-squared:  0.08148,    Adjusted R-squared:  0.0542 
## F-statistic: 2.987 on 6 and 202 DF,  p-value: 0.008121

Weed usage interestingly shows older people in this group tend to consume weed more than their younger counterparts. I feel this is somewhat driven by outliers, but “once a week” weed usage, with no outliers, was also significant. Part of me wonders how much of this has to do with the financial situation/living situation. Perhaps older individuals smoke more because they have both the financial ability and space (free from those who might look down on it) to do so. I’m not sure, it would’ve been pretty weird if I asked “what is your yearly income?” and “Do you live with an authority figure?”.

Table of usage compared to laws

Weed use compared to country/state laws
	Actively smokes weed	Doesn't actively smoke weed
not legal	22	25
medical/complicated	34	39
medical	29	20
decriminalized/complicated	1	4
recreational	23	13
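Raw counts are a little hard to compare across rows of different sizes, so here is a small sketch that turns the table above into the share of active smokers per legal status. The counts are copied from the table; the shortened labels `active` and `not active` are made up for the example.

```r
# Counts copied from the weed use vs law table above.
weed_law <- matrix(
  c(22, 25,
    34, 39,
    29, 20,
     1,  4,
    23, 13),
  ncol = 2, byrow = TRUE,
  dimnames = list(
    law = c("not legal", "medical/complicated", "medical",
            "decriminalized/complicated", "recreational"),
    use = c("active", "not active")
  )
)

# Row-wise proportions: percent of each legal-status group that actively smokes.
share_active <- round(100 * prop.table(weed_law, margin = 1)[, "active"], 1)
print(share_active)
```

Keep in mind the decriminalized/complicated row only has five people in it, so its percentage means very little.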

Concerts, General and Alex G Specific

In general

Alex G concerts

Making a histogram showing the general results. As you can see there are some outliers, but I have theories on why these outliers exist.

My theory is they live near Philly, and by using the state data I’ve collected I can determine that two of the largest outliers indeed do live near Philly (defined by living in Pennsylvania, New Jersey or Delaware). That said, the individual who has seen Alex 40 times lives in Missouri.
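The “lives near Philly” flag described above boils down to a membership check on the state column. A minimal sketch with invented respondents follows; the column names `state` and `near_philly` are hypothetical, while the three states are the ones named in the text.

```r
# Invented respondents; only the three-state definition comes from the text.
respondents <- data.frame(
  state         = c("Pennsylvania", "Missouri", "New Jersey", "California", NA),
  alex_concerts = c(12, 40, 8, 2, 1),
  stringsAsFactors = FALSE
)

philly_states <- c("Pennsylvania", "New Jersey", "Delaware")

# %in% returns FALSE for NA, so non-US respondents with no state become "No".
respondents$near_philly <- ifelse(respondents$state %in% philly_states, "Yes", "No")
print(respondents)
```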

Fun fact: According to setlist.fm Alex has played 284 concerts

But another theory I had is that maybe living near cities in general will increase the probability of an individual going to more concerts

Violin plot of living area and going to concerts

Violin plot of concert frequency and number of Alex G concerts

That “once a year if ever” dude with 40 Alex G concerts must be 40 then…. something is telling me that might’ve been a mistake.

I decided to use some linear models to check if any of these variables were good predictors

##                                               Estimate Std. Error       t value
## (Intercept)                               1.744968e-13   3.740517  4.665046e-14
## living_areaIn the outskirts of a city    -1.999617e-01   1.313950 -1.521837e-01
## living_areaIn the suburbs                 1.120113e-01   1.256890  8.911790e-02
## living_areaIn a city                      1.381653e+00   1.198747  1.152581e+00
## `Lives Near Philly?`Yes                   3.707999e+00   1.062184  3.490920e+00
## concertsOnce a year if ever               2.093536e+00   3.979742  5.260482e-01
## concertsEvery 3-4 months                  9.088468e-01   3.916566  2.320520e-01
## concertsEvery month or every other month  1.736156e+00   3.951054  4.394159e-01
## concertsEvery two weeks                   6.820543e-02   4.059121  1.680301e-02
## concertsWhenever I can                    1.572409e+00   3.932301  3.998699e-01
##                                              Pr(>|t|)
## (Intercept)                              1.0000000000
## living_areaIn the outskirts of a city    0.8791954668
## living_areaIn the suburbs                0.9290773914
## living_areaIn a city                     0.2504583279
## `Lives Near Philly?`Yes                  0.0005922191
## concertsOnce a year if ever              0.5994375140
## concertsEvery 3-4 months                 0.8167349029
## concertsEvery month or every other month 0.6608346103
## concertsEvery two weeks                  0.9866105225
## concertsWhenever I can                   0.6896790533

It seems that the best predictor in this case is living near Philly! But that’s not an honest analysis, because this result is likely driven by outliers, so we’re going to toss them.
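How the outliers get picked matters for a step like this. The rule used for this report is not stated, so the sketch below uses a common convention as a stand-in, flagging anything more than 1.5 × IQR above the third quartile, on an invented vector of concert counts.

```r
# Invented concert counts with two extreme values.
concerts <- c(0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 4, 5, 15, 40)

# Upper fence by the 1.5 * IQR convention (an assumption, not the report's rule).
q           <- quantile(concerts, c(0.25, 0.75))
upper_fence <- q[[2]] + 1.5 * (q[[2]] - q[[1]])

is_outlier <- concerts > upper_fence
trimmed    <- concerts[!is_outlier]

print(upper_fence)
print(concerts[is_outlier])
print(length(trimmed))
```

Whatever rule is used, it is worth reporting both fits, as is done here, so readers can see how much the extreme values were driving the estimate.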

##                                               Estimate Std. Error       t value
## (Intercept)                              -1.407823e-15  1.7205786 -8.182264e-16
## living_areaIn the outskirts of a city     3.009316e-01  0.6048521  4.975292e-01
## living_areaIn the suburbs                 3.222993e-01  0.5786990  5.569377e-01
## living_areaIn a city                      8.150397e-01  0.5520281  1.476446e+00
## `Lives Near Philly?`Yes                   1.220924e+00  0.5274623  2.314714e+00
## concertsOnce a year if ever               5.119795e-01  1.8319731  2.794689e-01
## concertsEvery 3-4 months                  1.265932e+00  1.8016165  7.026646e-01
## concertsEvery month or every other month  1.698775e+00  1.8178097  9.345178e-01
## concertsEvery two weeks                   8.652422e-01  1.8676406  4.632809e-01
## concertsWhenever I can                    1.397884e+00  1.8090075  7.727353e-01
##                                            Pr(>|t|)
## (Intercept)                              1.00000000
## living_areaIn the outskirts of a city    0.61937358
## living_areaIn the suburbs                0.57820542
## living_areaIn a city                     0.14142904
## `Lives Near Philly?`Yes                  0.02166441
## concertsOnce a year if ever              0.78017970
## concertsEvery 3-4 months                 0.48309847
## concertsEvery month or every other month 0.35118676
## concertsEvery two weeks                  0.64367706
## concertsWhenever I can                   0.44060990

So it still appears that the biggest determinant of seeing Alex G concerts is whether you live near Philly, which makes sense because, before Alex was touring, most of his shows were in that area (and people could’ve included Skin Cells concerts as Alex G concerts). That said, once you remove the outliers from the dataset, living near Philly is no longer as strong a predictor of how many Alex G concerts you’ve seen, but it is still the biggest one.

Book and Movie Genres

I didn’t know where this section fit best, so I just put it here. I didn’t really have any hypotheses here either, but I hope you enjoy the data!

Heatmap Table of book and movie genres; Chi-Square p-value of 0.301
	Crime	Drama	Fantasy	Historical Fiction	Horror	I don't read books	Mystery	Nonfiction	Poetry	Romance	Satire	Sci-Fi	Suspense/Thriller
Action	0	1	0	0	0	0	0	1	0	0	1	1	0
Adventure	0	0	1	1	0	0	0	2	0	0	0	1	0
Art House	0	3	4	3	1	1	0	13	5	0	3	4	1
Comedy	0	4	2	1	1	1	0	8	2	0	2	2	1
Documentary	0	0	0	0	0	0	1	2	3	0	2	1	0
Drama	1	14	12	3	0	6	1	8	4	1	1	4	0
Horror	1	4	1	0	2	4	0	7	2	2	2	4	2
I don’t watch movies	0	0	0	2	0	1	0	0	1	0	0	1	0
Romantic comedy	0	2	2	0	0	0	0	3	3	1	1	0	0
Suspense/Thriller	1	1	3	0	1	2	3	5	2	0	1	4	2

College/Major Analysis

Another question we asked was what type of college major people had, or whether they attended college at all, which we look at below.

Upset plot for different majors to see what double majors we have

We have 23 people with 2 majors, 4 people with three majors, and 1 with four majors.
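Counting how many majors each person listed can be sketched as below. It assumes the multi-select answers are stored as one delimited string per respondent; the separator and the example answers are invented, since the raw format is not shown here.

```r
# Invented multi-select answers, one string per respondent (NA = no answer).
majors <- c("Stem Field", "Arts; Humanities", "Stem Field; Arts; Business",
            NA, "Humanities")

# Split on the assumed separator and count the pieces per respondent.
answered <- majors[!is.na(majors)]
n_majors <- lengths(strsplit(answered, ";\\s*"))

print(table(n_majors))
```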

The most average Alex G person

So around 2020-05-08 09:57:30 a white heterosexual/straight male who is 23 years old decided to fill out the survey. They live in the United States, specifically in California, and when asked if they prefer multiple romantic partners they generally say no. Their favorite musician is Alex G; when you ask them their second favorite musician they’ll emphasize how much they like Alex G, before balking and saying Modest Mouse. They have a lot of opinions about Alex G: their favorite song is Snot, their favorite music video is Gretel, and their favorite album is Trick. That said, when asked what their favorite fan compilation is they reply “i don’t listen to them.”, which is probably because they listen to most of their music through Spotify, where it is somewhat difficult to get non-official tracks. Their favorite song is definitely Snot.

Nevertheless, they’re still a big Alex G fan: they started listening to him when they were 19 years old, and have been to 2 concerts so far. When asked what they like to do in their spare time, they have a couple of hobbies. They like to play the Guitar, but when asked if they’re in a band, they’d say no. They also use writing to express themselves, and when it comes to concerts they go “whenever i can”. They are studying/have studied the Stem Field and live in a city. When it comes to weed they would say “i used to smoke weed, but now i don’t” and with alcohol they’d say they drink “once a month/socially”.

When it comes to their political leanings, on the scale of 1 to 10 from left politics to right, and from libertarian to authoritarian, they fall around 2 and 3 respectively. Their personality, if you asked them arbitrarily their MBTI, astrological sign, and Harry Potter House they would also tell you, respectively, INFP, Gemini, and “i don’t know”. They find all of these questions you’re asking them kinda weird, almost like this format that I decided to explain the most average Alex G fan didn’t pan out the way I wanted, but they know everyone is just trying their best.
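A profile like the one above can be assembled by taking the most common answer to each question. Here is a minimal sketch of that idea on an invented mini-survey; `stat_mode()` is a hypothetical helper (base R has no built-in statistical mode), and numeric answers such as age would more likely be summarised with a median or mean instead.

```r
# Hypothetical helper: most frequent non-missing value, returned as text.
# Ties go to whichever value sorts first.
stat_mode <- function(x) {
  x <- x[!is.na(x)]
  counts <- table(x)
  names(counts)[which.max(counts)]
}

# Invented mini-survey with two categorical questions.
survey <- data.frame(
  fav_album = c("Trick", "Rules", "Trick", "Beach Music", "Trick"),
  sign      = c("Gemini", "Leo", "Gemini", NA, "Virgo"),
  stringsAsFactors = FALSE
)

profile <- vapply(survey, stat_mode, character(1))
print(profile)
```

One caveat with this approach: the most common answer to each question taken separately does not have to describe any actual respondent.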

Special Thanks!

Modmin Team of 666posting

In the order Facebook decided:

Ignacy R.
Isabelle J.
Abby C.
Armando M.
Cody G.
Ashtyn K.
Marley M.
Kara B.

Special shoutout to Isabelle for allowing and promoting the survey.

Other thanks in order of commenting on the original post

Zion H.: Inspiring the initial graph and indirectly the survey
Miguel Z.: Making the first preliminary visualizations
Alex J.: Favorite movie and book genres questions
Rivers F.: Consistent support throughout the process
Andrew W.: Consistent support as well
Fabiana O.: Suggested questions on race and queerness
Hillary N.: Suggested astrological signs questions
Luke D.: Suggested political stance questions
Gwendylan T.: Suggested weed usage and Myers-Briggs type questions
Abigail C.: For testing the survey and giving constructive criticism
Luke P.: Allowing me to link his video

Helpful friends not of the group

Stephanie Y.: Gave critiques to the Rmarkdown and suggested spotifyr
Veronica B.: Suggested an appropriate statistical test for regional differences, and gave helpful criticisms
Kurt W.: Helpful criticisms as well.

One special thanks

Graham aka Grumpus for talking to me about Alex G back when I knew diddly squat besides Mary. Thank you for introducing me to one of my favorite artists of all time.

Things to Do

Shiny

Alex G Census 2020 Analysis

Tim Nieuwenhuis

4/30/2020
