I think what you have done is fascinating overall. However, I felt this post was a bit of a tangle, and I didn’t get the ‘so what’. Could you please think about a summary that says either ‘why what I found matters’ or ‘why what I found interested me’?
Thanks
Trans women are female. Trans men are male. Why the hell do you always group trans men with cis women and trans women with cis men in your studies? Why do you only separate nonbinary people by sex assigned at birth? Why do you use "transwomen" and "transmen", classic TERF language?
I would be very curious to see this broken down by age of transition. Are trans women who underwent female puberty closer to cis women, or to non-cis people who transitioned later in life? This, I think, would help answer the question of whether those differences are due to socialisation or physiology.
Amazing sample you are working with. Huge props for collecting it. For the notion of similarity, I would have zero-meaned the scores before taking the correlation between groups. What I mean by that: if you have a 1-100 rating on something rare or very bad (e.g., "I am a serial killer"), then almost everyone will score 100. When you take the correlation, that question will tend to count more than something where people give a range of responses and the mean is near 100.
You end up getting a lot more separation between the groups, because you are comparing how they differ from the global average. So instead of getting correlations between .9 and 1 for bio_males, I got these correlations of bio_male with each group: biomale 1.00, biofemale -0.21, cisbiomale 1.00, cisbiofemale 0.02, queerbiomale 0.13, queerbiofemale -0.90, enbybiomale 0.24, enbybiofemale -0.88. It's interesting that you can see bio_males and bio_females are a bit different (-0.21), but queerbiofemale is basically the opposite of bio_male (-0.9). As if the identity is to be what straight males are not. (Granted, I did not drop any of the questions specifically about gender. You can see my work at https://docs.google.com/spreadsheets/d/1eaosAMYxjVTi-nUkxWZ10K3teM0X9z4NEwv13T5K7kc/edit?usp=sharing)
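To make the zero-meaning idea concrete, here is a minimal sketch in plain Python. The group names and scores are made up for illustration (they are not from the survey or the spreadsheet), and "global average" is assumed here to mean the unweighted average of the group means for each question; weighting by group size would be a reasonable alternative.

```python
from statistics import mean

def pearson(x, y):
    """Plain Pearson correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def center_by_question(group_means):
    """Subtract each question's cross-group average from every group's score."""
    names = list(group_means)
    n_questions = len(group_means[names[0]])
    question_avg = [
        mean(group_means[g][q] for g in names) for q in range(n_questions)
    ]
    return {
        g: [score - avg for score, avg in zip(group_means[g], question_avg)]
        for g in names
    }

# Hypothetical mean scores per group on four questions. The first question
# behaves like the "serial killer" item: every group sits near 100.
toy_means = {
    "group_a": [98.0, 60.0, 30.0, 55.0],
    "group_b": [99.0, 40.0, 50.0, 45.0],
    "group_c": [97.0, 50.0, 40.0, 50.0],
}

raw_r = pearson(toy_means["group_a"], toy_means["group_b"])
centered = center_by_question(toy_means)
centered_r = pearson(centered["group_a"], centered["group_b"])

print(f"raw correlation:      {raw_r:.2f}")
print(f"centered correlation: {centered_r:.2f}")
```

In this toy data the raw correlation is strongly positive because both groups share the near-100 item, while the centered correlation turns negative because the two groups deviate from the average in opposite directions on the other questions.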
There are some more complicated notions of similarity which wouldn't be so hard to implement. For example, train a classifier to go from personality -> gender. I have done this on a large (100k) sample of Big Five data and gotten ~80% accuracy. The errors are typically pretty informative when displayed as a confusion matrix: for each class, show how often it was misclassified as each other class. This has the advantage of modeling nonlinear relationships in the data, as you can use a neural net or random forest as the classifier.
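A rough sketch of the confusion-matrix idea, again in plain Python. Everything here is an assumption for illustration: the data is synthetic, the labels are placeholders, and a nearest-centroid classifier stands in for the neural net or random forest. Nearest-centroid is a linear method, so it does not capture the nonlinear relationships mentioned above; the point is only how the row-normalized confusion table is read.

```python
import random

def centroid(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def nearest_label(point, centroids):
    """Return the label whose centroid is closest to the point."""
    def sq_dist(label):
        return sum((p - c) ** 2 for p, c in zip(point, centroids[label]))
    return min(centroids, key=sq_dist)

def confusion_rates(true_labels, predicted_labels, labels):
    """For each true class, the fraction predicted as each class."""
    counts = {t: {p: 0 for p in labels} for t in labels}
    for t, p in zip(true_labels, predicted_labels):
        counts[t][p] += 1
    rates = {}
    for t in labels:
        total = sum(counts[t].values())
        rates[t] = {p: counts[t][p] / total for p in labels}
    return rates

# Synthetic two-trait data: label_a and label_b overlap, label_c is far away.
rng = random.Random(0)
centers = {"label_a": (0.0, 0.0), "label_b": (2.0, 0.0), "label_c": (8.0, 8.0)}
labels = list(centers)
train, test = [], []
for label, (cx, cy) in centers.items():
    for i in range(200):
        point = [rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)]
        # First 150 samples per class are for fitting, the rest held out.
        (train if i < 150 else test).append((point, label))

centroids = {
    lab: centroid([p for p, l in train if l == lab]) for lab in labels
}
predicted = [nearest_label(p, centroids) for p, _ in test]
rates = confusion_rates([l for _, l in test], predicted, labels)

for true_label in labels:
    row = "  ".join(f"{rates[true_label][p]:.2f}" for p in labels)
    print(f"{true_label}: {row}")
```

Each printed row is one true class, and each column is a predicted class. Classes that get confused with each other often are, by this measure, similar; classes that are almost never confused are far apart.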
You're missing the category name "cis man" in the second table.
My intuition would be that these results are dominated by gender progressivism, as women are more progressive than men and trans people are more progressive than cis people.