Five ways that technology will democratize social science
I currently work both as a researcher at USC and as the Director of Data Science at Ranker.com. Some people would consider these two roles unrelated, but I'm finding that there is a lot of overlap. Technological methods are increasingly useful in social science at the same time as social science methods are being imported into technology companies, many of which are trying to create statistical models to predict behavior. As more and more data on human behavior and thought is collected by technology companies, as opposed to university researchers, it seems inevitable that social science itself will be changed.
Technology has not just changed, but disrupted, every other dominant form of information distribution that previously existed, be it the distribution of music (iTunes), news (Huffington Post), books (Amazon), TV (Hulu), gossip (Twitter), jokes (Cheezburger), language (c u l8r), family news (Facebook), or education (TED talks or the Khan Academy). While academia is called the ivory tower for a reason, it seems unlikely that it will escape this wave of change, especially given that the biggest technology companies collect far more data on human thought and behavior in a day than all of academia collects in a year.
Here are five specific ways that I believe technology will change social science:
1. Bigger, ecologically valid data sets - The only thing that separates social science from opinion is the use of data, and with more data come more confident findings. There is currently some debate in social psychology about methodologies that can produce false positive results by taking advantage of chance. For example, statistical significance is conventionally defined, in many sciences, as a result that would occur less than 5% of the time by chance alone if no real effect existed. That sounds impressive, but if 200 researchers each test an effect that does not exist, about 10 of them can expect to find a "significant" result by sheer chance. As data sets get bigger and bigger, the chance of error will become lower and lower, with standards for "significance" getting more and more stringent. In addition, most of this new data will be collected in real world environments, meaning that there will be less of a logical leap than when inferring a real world phenomenon from the results of a lab study.
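The arithmetic behind the 200-researcher example can be sketched in a few lines of Python. This is only an illustration, assuming a 5% significance threshold and an effect that truly does not exist; the function names are hypothetical and chosen for this example.

```python
import random

ALPHA = 0.05  # conventional significance threshold (an assumption for illustration)


def expected_false_positives(n_studies, alpha=ALPHA):
    """Expected number of studies reaching significance when no effect exists."""
    return n_studies * alpha


def simulate_false_positives(n_studies, alpha=ALPHA, seed=0):
    """Count studies that cross the threshold by chance alone.

    When there is no real effect, a p-value is uniformly distributed
    between 0 and 1, so each study is modeled as one uniform random draw.
    """
    rng = random.Random(seed)
    return sum(rng.random() < alpha for _ in range(n_studies))


print(expected_false_positives(200))         # the "10 of 200" in the text
print(expected_false_positives(200, 0.001))  # a more stringent threshold
print(simulate_false_positives(200))         # one simulated batch of 200 studies
```

Lowering the threshold is what "more stringent standards" means in practice: the same 200 attempts would be expected to produce far fewer chance findings.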
2. Cross-sample Validation - With more data comes the possibility of dividing a dataset into many parts (e.g. by referral URL) and replicating research in many datasets. To do this efficiently will require a technology we use a lot at Ranker, the semantic web. The semantic web provides a common framework that allows data to be shared and reused across application, enterprise, and community boundaries. Right now, researchers cross-validate findings through a painstaking process called meta-analysis, whereby interested parties attempt to reconcile various datasets into a standard format. Of course, because there are no standards, each reconciliation is a one-time, throwaway process, and all the data extracted by one researcher is unusable by subsequent researchers. Google Scholar is starting the process of taking "dumb text" in papers and creating some metadata: my Scholar page contains extractions of dates, authors, and citations from papers I've written. If there were a standard format for describing the data within the papers, there is no reason why that data couldn't be extracted as well, allowing us to answer questions like "what is the correlation between openness to experience and ideology in data collected before and after 2005?" without having to read all those papers. It would also let people share simple findings like "the correlation between openness to experience and ideology at place X and time Y is Z," which are completely lost now.
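To make the idea concrete, here is a minimal Python sketch of what sharing findings in a standard format could enable. The Finding record, its field names, and every number in it are made up for illustration and are not real results or an existing standard; the pooling step uses the Fisher z-transform, a common meta-analytic technique that I am swapping in as an example, not a method described above.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """One shareable result: 'the correlation between A and B at time Y is Z'."""
    variable_a: str
    variable_b: str
    year: int
    sample_size: int
    correlation: float


def matching(findings, a, b, year_filter):
    """Select findings about the pair (a, b), in either order, that pass year_filter."""
    return [
        f for f in findings
        if {f.variable_a, f.variable_b} == {a, b} and year_filter(f.year)
    ]


def pooled_correlation(findings):
    """Pool correlations via the Fisher z-transform, weighting each study by n - 3."""
    weighted_sum = 0.0
    total_weight = 0.0
    for f in findings:
        if f.sample_size <= 3:
            raise ValueError("each finding needs sample_size > 3")
        weight = f.sample_size - 3
        weighted_sum += weight * math.atanh(f.correlation)
        total_weight += weight
    if total_weight == 0:
        raise ValueError("no findings to pool")
    return math.tanh(weighted_sum / total_weight)


# Made-up records, for illustration only; these are not real results.
FINDINGS = [
    Finding("openness to experience", "ideology", 1998, 120, 0.20),
    Finding("ideology", "openness to experience", 2003, 300, 0.25),
    Finding("openness to experience", "ideology", 2009, 1500, 0.18),
    Finding("openness to experience", "ideology", 2011, 800, 0.22),
    Finding("risk tolerance", "ideology", 2010, 400, 0.05),
]

before = matching(FINDINGS, "openness to experience", "ideology", lambda y: y < 2005)
after = matching(FINDINGS, "openness to experience", "ideology", lambda y: y >= 2005)
print(round(pooled_correlation(before), 3), round(pooled_correlation(after), 3))
```

The point is less the statistics than the format: once findings are structured records rather than sentences buried in papers, the "before and after 2005" question becomes a filter and a few lines of arithmetic.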
3. Adding inter-disciplinary analysis and variables - Right now, social science is balkanized. Every discipline has its own methodologies and opinions about what is or is not the right way to do things. Personality psychologists care a lot about measurement while political scientists care about sampling. Social psychologists create brilliant, artificial, controlled lab experiments designed to isolate variables, while technology companies mine free form, uncontrolled data seeking exploratory patterns. Qualitative methods have a richness and depth that is scoffed at by more quantitative researchers. All of these methodologies have error, and the methodologies within any one discipline share the same kinds of error, such that they all would be improved by adding the techniques of other disciplines. But as long as there are no standards for data (e.g. the semantic web), reconciling this data would require immense human effort. Further, the lack of standards means that we never have the full picture of human thought and behavior. Psychologists may study risk tolerance and variable A and financial analysts may study risk tolerance and variable B, which might lead to a natural hypothesis as to the relationship between A and B. But since psychologists are not interested in B and financial analysts do not care about A, nobody reconciles this data. Real world human behavior usually involves the variables in ALL disciplines, yet each discipline often contents itself with its own slice of a human being. Semantic technology will eventually allow us to put these slices together.
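The risk tolerance example can be sketched as a small Python function. It assumes findings from different disciplines were already described with shared variable names, which is exactly what does not exist today; the function name and the sample data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def unstudied_pairs(studied_pairs):
    """Find variable pairs that share a common neighbor but were never studied together.

    Each result is (variable_1, variable_2, shared_variable), a candidate hypothesis.
    """
    neighbors = defaultdict(set)
    for a, b in studied_pairs:
        neighbors[a].add(b)
        neighbors[b].add(a)

    candidates = set()
    for shared, linked in neighbors.items():
        for a, b in combinations(sorted(linked), 2):
            if b not in neighbors.get(a, set()):
                candidates.add((a, b, shared))
    return sorted(candidates)


# Hypothetical relationships, each studied within a single discipline.
STUDIED = [
    ("risk tolerance", "variable A"),  # e.g. studied by psychologists
    ("risk tolerance", "variable B"),  # e.g. studied by financial analysts
]

print(unstudied_pairs(STUDIED))
# [('variable A', 'variable B', 'risk tolerance')]
```

With only two studies the answer is obvious, but the same lookup across thousands of findings from many disciplines is the kind of reconciliation that nobody currently does by hand.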
4. Systems level approaches - Of course, putting together the results of semantic datasets, which combine hundreds of variables and many bi-directional connections, all with varying degrees of confidence arrived at through various methodological and sampling techniques, is not easy using the traditional paper format. The end result of such an approach is often a system or a model, of the type that computer scientists build, rather than a paper. Some psychologists are putting together connectionist models, but the expertise to actually do such things lies in technology circles more than in the social science community.
5. An open knowledge base - The internet hates middlemen, and right now, academic publishers are middlemen who control the flow of information under the outdated idea that people read printed editions of journals devoted to specific limited topics with limited pages. The goal of the editorial process is to separate truth from untruth through peer review, which is a laudable but completely impractical goal, as evidence exists along a continuum instead of being categorically true or untrue. There are so many peer-reviewed journals that anything can get the stamp of "truth". Unlike in physics or chemistry, a single paper's worth of evidence, no matter where it appears, is never conclusive in social science. Big controversies exist in social science even about subjects where there are tons of very well-done papers, each of which is ostensibly the truth, or else it shouldn't have been published, right? The reality of social science is that the best we can do is to sum up all the evidence from all the various data collected, hopefully using various methodologies (again, something the semantic web can solve), and get a bigger picture of how robust any finding is. However, since peer review checks for importance, topicality, novelty, and a host of other subjective factors, not to mention a journal's bias against replications and null findings, the current process actually ends up hiding the true sum of all evidence for any finding. That is how prominent, blatantly false findings can exist in the literature for years undetected. Further, since journals require high subscription fees from universities (whose employees, ironically, do all the work for the journal), only people at first world universities can even see this evidence. Whether you agree with my hypothesis or not, the current system is simply unsustainable given the mountain of data that is coming and the ethos of Silicon Valley, where publish then review/filter/aggregate is the dominant model.
As more and more data on human behavior and thought is published by companies like Hunch, OkCupid, Ranker and the Facebook data team, the traditional social science system will necessarily adapt to these methods or become largely irrelevant next to these larger, more ecologically valid, robust, and complex datasets.
In summary, social scientists are incredibly smart about what they do, most more so than I, and there is a lot that technologists can learn from social science methods. Indeed, on March 11, I'll be giving a talk at SXSW about how much technologists can benefit from social science methods, especially as they relate to serving the intangible needs of employees and customers.
However, there are countless ways that social scientists can benefit from technology as well. Human beings have been studying the human condition for thousands of years, and the idea that a select group of humans can use their special methodology to go off into an ivory tower, figure things out, and then inform the rest of us what the truth is, is an unlikely scenario. Or perhaps more correctly, it is a common scenario that has played out throughout history with no actual impact on our collective understanding. If we really want to make an impact on our collective understanding of ourselves, it will take a collective effort from social scientists and internet professionals, quantitative and qualitative researchers, novelists and political scientists, and even the kid who surveys their 3rd grade class, whose data contributes to our collective understanding too. It is my proposition that technology, and specifically the semantic web, may finally allow such a collaboration to occur.
- Ravi Iyer
Comments
March 6th, 2012 - 12:39
An additional disruptive vector that technology brings to academia is the more ubiquitous availability of books, journals, and other reference material. One reason universities were the “ivory tower” was because you had to be there to access the library. The better universities had better libraries. Better libraries meant better research. With the ability to access much more by electronic means, the need to be physically located with the library is lessened.
In short, it is now more possible for an academic researcher to conduct and publish serious work without having to be on the faculty of the top schools.
March 17th, 2012 - 15:14
Re your point 5: The hypothes.is effort is along the lines of resolving your complaints about the current peer review paradigm. They anticipate getting a working site up late this year. I think the team is very bright, with an excellent chance of success.
Re: your point 2: The known interdisciplinary needs of science have always been underexplored, but now they are exploding in both volume and urgency, especially in social sciences, as various challenges converge on the modern world rapidly. There are exciting hints of what’s possible, thanks to all the information from people like you that get it over levee walls via meta-analysis and however else they can. This effort of mine, to provide a simple guidance for liberals to understand and speak with conservatives, has been a nightmare of balkanized, hidden data, contradictions, $35 research papers, parallel semantics, and just plain effectively/empirically stupid, empirically evil, culturally-mandated disregard of what’s going on down the hall. A tremendous amount of money and time wasted- but the real loser, as in my output’s case, is Joe and Jane, who wait for decades to get something basic done, while the data or techniques have been knocking around in 14 different corners of academia, waiting on efficiency, synergy, standardization, or consensus to become useful output. The loser is the guy trying to raise his kid, or trying to eat the right things, or trying to not argue with his uncle.
The word in the business world for this kind of middleman elimination is ‘disintermediation.’ A good word, and a general, base concept for social science in my opinion, especially in the context of information displacements. Socially, there are enormous ignored impacts to disintermediation- exhibit 1 and 2 are the issues that OWS and the Tea Party are hit-and-missing on the subject. This academic version of disintermediation you’re championing is no exception, with challenges that look akin to journalism’s, other information-based leveling, and some of globalism’s more general problems: who are the new experts, how do we know them (reputation management)- and how do we grow a quality expert base- how do they get paid, since we’re tearing down crystal-clear hierarchies and replacing them with anarchic structures that are far more efficient? How do we continue to do large-scale, high-quality effort as this leveling occurs? How do we take care of the ‘old experts’, so that they don’t fight the effort, and so they continue to maximize their contribution? I’m not opposed to your desires- quite the contrary- but I’ve lived the change management side of this battle for a couple of decades now. This is not Encyclopedia Britannica becoming Wikipedia: it requires a leveling and standardization of a much more problematic subset of the knowledgebase. We’re essentially saying we want to democratize (publicize cheaply) large amounts of proprietary data and intellectual property, craft standard metadata, destroy or radically modify the workup methodology for the data (generating product for journal placement). And hey, here’s our metametadata thinking so far…it’ll be interesting to see how hypothes.is does in their pro publica effort, and how subversive it ends up.
March 18th, 2012 - 21:56
I agree with Scott