
Ranker Predicts Jacksonville Jaguars to have NFL’s worst record in 2014

Reposted from this post on the Ranker Data Blog

Today is the start of the NFL season, and building on our success in using crowdsourcing to predict the World Cup, we’d like to release our predictions for the upcoming NFL season.  Using data from our “Which NFL Team Will Have the Worst Record in 2014?” list, which was largely voted on by the community at WalterFootball.com (using a Ranker widget), we predict the following order of finish, from worst to first.  Unfortunately for fans in Florida, the wisdom of crowds predicts that the Jacksonville Jaguars will finish last this year.

As a point of comparison, I’ll also include predictions from WalterFootball’s Walter Cherepinsky, ESPN (based on power rankings), and Betfair (based on betting odds for winning the Super Bowl).  Since we are attempting to predict the teams with the worst records in 2014, the worst teams are listed first and the best teams are listed last.

Ranker NFL Worst Team Predictions 2014

The value proposition of Ranker is that we believe the combined judgments of many individuals are smarter than even the most informed individual experts.  As such, the crowdsourced judgments from Ranker should outperform those from the experts.  Of course, there is a lot of luck and randomness throughout the NFL season, so our results, good or bad, should be taken with a grain of salt.  What is perhaps more interesting is the proposition that crowdsourced data can approximate the results of a betting market like Betfair, for the real value of Ranker data is in predicting things where there is no betting market (e.g. what content should Netflix pursue?).

Stay tuned until the end of the season for results.

- Ravi Iyer

Go to Source

Comments

comments

Living Room Conversations Builds Trust Across Differences Concerning CA Prison Policy

Reposted from this post on the Civil Politics Blog

At CivilPolitics, one of our service offerings is to help groups that are doing work connecting individuals who may disagree about political and moral issues.  These disagreements do not necessarily have to be about partisanship.  One organization that we work with is Living Room Conversations, a California-based non-profit that holds small gatherings co-hosted by individuals who may disagree about a particular issue, in order to consciously foster non-judgmental sharing about potentially contentious issues.  Below is a description from their website, in addition to a short video.

Living Room Conversations are designed to revitalize the art of conversation among people with diverse views and remind us all of the power and beauty of civil discourse. Living Room Conversations enable people to come together through their social networks, as friends and friends of friends to engage in a self-guided conversation about any chosen issue. Typically conversations have self-identified co-hosts who hold differing views. They may be from different ethnic groups, socio-economic backgrounds or political parties. Each co-host invites two of their friends to join the conversation. Participants follow an easy to use format offering a structure and a set of questions for getting acquainted with each other and with each other’s viewpoints on the topic of the conversation.

Living Room Conversations is currently holding conversations around the issue of “realignment” in California, which is designed to alleviate prison overcrowding and where many would like to develop alternatives to jail for non-violent criminals.  Living Room Conversations wanted help understanding the effects of their program so we worked with them to develop a survey appropriate for their audience, asking people about their attitudes before and after conversations.  Informed by work in psychology, we looked at how reasonable, intelligent, well-intentioned, and trustworthy people felt after these meetings, especially toward people on the opposite side of the issue, compared to how they felt before the meeting.  Results, based on a 7-point scale, are plotted below.

[Chart: average pre-to-post change in how reasonable, intelligent, well-intentioned, and trustworthy participants rated those who disagree with them]

The fact that all scores are greater than zero means that people felt that individuals who disagreed with them on these issues were more reasonable, intelligent, well-intentioned, and trustworthy compared to how they felt before the conversation (though with a sample size of only 23 individuals so far, only the increase in trustworthiness is statistically significant).
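For readers curious how such a before/after comparison is tested, a paired t-test on each respondent’s pre and post ratings is the standard approach. Here is a minimal sketch in Python, using toy 7-point-scale numbers rather than the actual survey data:

```python
import math

def paired_t(pre, post):
    """t statistic for paired before/after ratings from the same respondents."""
    diffs = [b - a for a, b in zip(pre, post)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)  # sample variance
    return mean / math.sqrt(var / n)

# toy 7-point-scale ratings (hypothetical, not the survey's actual data)
t = paired_t(pre=[4, 4, 5], post=[5, 5, 7])  # -> 4.0
```

With the survey’s 23 respondents there would be 22 degrees of freedom, so a t statistic of roughly 2.07 or beyond would reach significance at p < .05.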

There was still a stark difference between how people felt about those who disagreed with them on these issues and how they felt about those they agreed with: respondents both before and after the event felt that those they agreed with were more likely to be reasonable, intelligent, well-intentioned, and trustworthy.  We also asked people about their attitudes toward realignment policy itself, and those attitudes didn’t change.  However, civility, as we define it, is not the absence of disagreement, but rather being able to disagree in a civil way that respects the intentions of others.

Moreover, even if people’s minds hadn’t changed on the issue, individuals felt strongly (8+ on a 10-point scale) that talking with others who hold different views is valuable.  Research on the effects of such positive contact suggests that if these individuals follow through, they will likely build on these attitudinal gains toward those who disagree.  Given that, these conversations appear to be a step in the right direction.

- Ravi Iyer


The Value of Opinion Datasets – Twitter vs. Facebook vs. Ranker vs. Market Research vs. ?

As Ranker’s Chief Data Scientist, I’ve been doing a lot of thinking of late about how much a given opinion dataset is worth.  I’m not talking about the value of a specific dataset for answering a specific question, as that varies wildly depending on the question; rather, I’d like to consider broad datasets/categories of data that promise to satisfy the world’s thirst for opinion answers.  The existence of sites like Quora and Yahoo Answers, as well as the move by many search engines from providing links to pages to directly answering questions, highlights the need for such data, as does the growing demand for opinion queries.  The future of services like Siri, Cortana, and Google Now is one where questions about what to buy for one’s wife, where to eat, and what to watch on TV are answered directly, and to do that well, one needs the data to answer those questions.  Are the world’s data collection methodologies up to the task?

One reason I ask is that there seems to be a misconception that large amounts of data can answer anything.  I’m a huge believer in reform in academia, but one thing my traditional peer-review-oriented training has given me is an appreciation for why that isn’t true.  Knowing which universities have more followers, likes, or mentions isn’t going to tell you which one has the best reputation.  Still, there certainly are advantages to scale, as well as depth.  The math behind both psychometrics and crowdsourcing tells me that no one dataset is likely to have the ultimate answers: all data has error, and aggregating across that error, which is Nate Silver’s claim to fame, almost always produces the best answer.  So as I consider the datasets below, the true answer as to which you should use is “all of the above.”  That being said, I think it is helpful (at least for organizing my thinking) to consider the specific dimensions that each dataset does best.
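The error-cancellation argument can be sketched in a few lines of Python. Assuming many judges estimate the same quantity with independent, unbiased errors (all numbers here are illustrative), the average of their judgments lands far closer to the truth than the typical individual:

```python
import random

random.seed(0)       # deterministic illustration
true_value = 50.0    # the quantity everyone is trying to judge

# 200 judges, each unbiased but noisy (independent errors, sd = 10)
judgments = [true_value + random.gauss(0, 10) for _ in range(200)]

crowd_estimate = sum(judgments) / len(judgments)
crowd_error = abs(crowd_estimate - true_value)
avg_individual_error = sum(abs(j - true_value) for j in judgments) / len(judgments)
# Averaging cancels much of the independent error, so the crowd's
# estimate beats the typical individual judge by a wide margin.
```

The same logic underlies aggregating across datasets: each source’s errors partially cancel, so long as they aren’t perfectly correlated.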

Below I consider prominent datasets along four dimensions: sampling, scale, breadth, and measurement.  Sampling refers to how diverse and representative the set of users answering a question is.  Note that this isn’t essential in all cases (drug research with life-and-death implications is almost always done on samples that are extremely limited/biased), and perfect sampling is almost impossible these days, such that even the best political polls rely on mathematical models to “fix” the results.  Such modeling requires Scale, which is important in that it helps one find non-linear patterns in data and prevents spurious conclusions from being reached.  Related to that is Breadth, as large datasets also tend to answer larger numbers of questions.  Anyone can spend the money on the definitive study of a single question at great expense, but that won’t help us with the many niche questions that exist (e.g. what is the best Barry Manilow song I can play to woo my date?  What new TV show would my daughter like, given that she loves Elmo?).  Measurement might be the most important dimension of them all, as one can’t answer questions that one doesn’t ask well.

How do the most prominent datasets in the world fare along these dimensions?

Twitter – Sampling: C, Scale: B+, Breadth: A, Measurement: C

Twitter is great for breadth, which can be thought of not only in terms of the things talked about, which are effectively infinite on Twitter, but also in terms of the range of emotions expressed (e.g. the difference between “awesome” and “cool” can potentially be parsed).  There is also a lot of scale.  Unfortunately, Twitter users are hardly representative, as people who tweet are a specific, self-selected group.  Measurement is very hard on Twitter as well: there is very little context to a tweet, so one can’t tell if something is genuinely popular or just heavily promoted.  And natural language will always have ambiguity, especially in 140 characters (e.g. consider how many interpretations there are for a sentence like “we saw her duck”).

Facebook – Sampling: B, Scale: A, Breadth: B, Measurement: D

Facebook is ubiquitous and reaches a far more diverse audience than Twitter.  People provide data of all sorts about all sorts of topics, too.  I bought their stock because I think their data is absolutely great, and I still do.  Still, the ambiguity of a “like” (combined with the haphazard and ambiguous nature of relying on status updates) means there will always be questions (e.g. how hated is something?  what do I think of a company’s individual products?  is this movie overrated?) that can’t be answered with Facebook data.

Behavioral Targeting – Sampling: B-, Scale: A, Breadth: C, Measurement: D

Companies like DoubleClick (owned by Google) and Gravity track your web behavior and attempt to infer information about you based on what you do online.  They can therefore infer relationships between almost anything on the web (e.g. Mad Men “interests”) based on web pages having common visitors.  Yet the use of vague terms like “interest” highlights the fact that these relationships are highly suspect.  Anyone who has looked up what these companies think they know about them can see that the error rates are fairly high, which makes sense when you consider the diverse reasons we all have for visiting any website.  This type of data has proven utility for improving ad response across large groups, where the law of large numbers means that some benefit will accrue from using it.  But I wouldn’t want to rely on it to truly understand public opinion.

Market Research – Sampling: B, Scale: D, Breadth: D, Measurement: A

Market research companies like Nielsen and GfK spend a lot of money to ask the right questions of the right people.  Measurement is clearly a strength, as market research companies can provide context and nuance as needed, asking about specific opinions on specific items in the context of other similar items.  Yet, given that only ~10% of people will answer surveys when called, even the best sampling that money can buy will be imperfect.  Moreover, these methods do not scale, given the cost, and can only cover questions that clients will pay for, so there is no way such methods can power the diverse queries that will go to tomorrow’s answer engines.

Ranker – Sampling: B-, Scale: B-, Breadth: B+, Measurement: A

I work at Ranker largely because I believe in the power of our platform to uniquely answer questions, even if we don’t have the scale of larger sites like Twitter and Facebook…yet (we are growing and are among the top 200 websites now, per Quantcast).  Our sample is imperfect, as are all samples, including those of pollsters like Gallup, but it is generally representative of the internet given that we get lots of traffic from search, so we can model our audience in the same ways that companies like YouGov and Google Consumer Surveys do.  The strength of our platform is our ability to answer a broad number of specific questions explicitly, in the context of alternative choices, using the list format.  Users can specifically agree (or disagree) that Breaking Bad is great for binge watching, that Kanye West is a douchebag, that being smart is an important life goal, or that intelligence is a turn-on, while also considering other options they may not have thought of.

In summary, no dataset gets “A”s across the board, and if I were running a company like Procter & Gamble and needed to understand public opinion, I would use all of these methods and triangulate amongst them, as there is something to be uniquely learned from each.  That being said, I agree with Nate Silver’s suggestion to Put Data Fidelity First, and am excited that Ranker continues to collect precise, explicit answers to increasingly diverse questions (e.g. the best Doritos flavors).  We are the only company that combines the precision of market research with the scale of internet polling methods, and so I’m hopeful, as our traffic continues to grow, that the value of our opinion data will grow with it.

- Ravi Iyer

ps. I welcome disagreement and thoughtful discussion as I’m certain I have something to learn from others here and that there are things I could be missing.


Political Discrimination as Normative as Racial Divisions once were

Reposted from this post on the Civil Politics Blog

Once upon a time, it was socially normative for society to divide itself along racial lines.  Thankfully, that time has passed and while racism still exists, it is generally considered to be a bad thing by most people in society.  The same trajectory is occurring with respect to attitudes toward homosexuals, with increased acceptance being not only encouraged, but mandated as the right thing to do.  However, in many circles, it remains normative for individuals to discriminate against those with the opposite political views.  Recent research indicates that this occurs amongst both parties.

Despite ample research linking conservatism to discrimination and liberalism to tolerance, both groups may discriminate. In two studies, we investigated whether conservatives and liberals support discrimination against value violators, and whether liberals’ and conservatives’ values distinctly affect discrimination. Results demonstrated that liberals and conservatives supported discrimination against ideologically dissimilar groups, an effect mediated by perceptions of value violations. Liberals were more likely than conservatives to espouse egalitarianism and universalism, which attenuated their discrimination; whereas the conservatives’ value of traditionalism predicted more discrimination, and their value of self-reliance predicted less discrimination. This suggests liberals and conservatives are equally likely to discriminate against value violators, but liberal values may ameliorate discrimination more than conservative values.

In addition, recent research out of Stanford University indicates that “hostile feelings for the opposing party are ingrained or automatic in voters’ minds, and that affective polarization based on party is just as strong as polarization based on race.”  Tackling this at the societal level is a daunting task for anyone, but there are things one can do at the individual level.  Both research and practice indicate that positive relationships between individuals across such divides are likely to ameliorate such feelings.  Blurring group boundaries is likely to make competition less salient as well, perhaps allowing the superordinate goals we all share to come to the fore, as often happens when national emergencies strike.  Just as with discrimination based on race and sexual orientation, discrimination against opposing ideologies can be combated with similar techniques.

- Ravi Iyer


Overcoming The Psychological Barriers to Combining Realism with Idealism

Reposted from this post on the Civil Politics Blog

I was recently forwarded this thoughtful article by Peter Wehner, from Commentary Magazine, that talks about the need for people to appreciate the importance of idealism in striving for policy goals as well as the realism of compromise with others who also have valid parts of the truth.  From the article:

Politics is an inherently messy business. Moreover, the American founders–who developed the concepts of checks and balances, separation of powers, and all the rest–wanted politics to be messy. …

Too often these days, zealous people who are in a hurry don’t appreciate that the process and methods of politics–the “messy,” muddling through side of politics–is a moral achievement of sorts. But this, too, is only part of the story.

The other part of the story is that justice is often advanced by people who are seized with a moral vision. They don’t much care about the prosaic side of governing; they simply want society to be better, more decent, and more respectful of human dignity. So yes, it’s important not to make the perfect the enemy of the good. But it’s also the case that politics requires us to strive for certain (unattainable) ideals….

What happens all too often in our politics is that people who are drawn to one tend to look with disdain on those who are drawn to the other. What we need, I think, is greater recognition that both are necessary, that each one alone is insufficient. Visionaries have to find a way to give their vision concrete expression, which requires deal-making, compromise, and accepting something less than the ideal. Legislators need to govern with some commitment to philosophical and moral ideals; otherwise, they’re just passing laws and cutting deals for their own sake.

Unfortunately, moral conviction is often negatively correlated with appreciating the need for compromise.  How then can we combine realism with idealism?  We here at CivilPolitics are actively supporting research to help understand how to remove these barriers to groups coming together despite moral disagreements and welcome contributions from academics who have good ideas.  Some ideas that have support in the research include improving the personal relationships between groups and introducing super-ordinate goals where moral agreement can occur.  In future months, we’ll be highlighting other recommendations along these lines to help combine realism with idealism.

- Ravi Iyer


CivilPolitics.org comments on Hollande’s Political Strategy for BBC World

Reposted from this post on the Civil Politics Blog

Earlier today, I appeared on BBC World’s Business Edition to comment on Francois Hollande’s efforts to unite union and business interests in working to improve the lagging French economy.  I provided the same advice that I often do to groups that are looking to leverage the more robust findings from social science in conflict resolution, specifically that rational arguments only get you so far and that real progress is often made when our emotions are pushing us toward progress, as opposed to working against us.  Accordingly, it often is better to try to get the relationships working first, in the hopes that that opens minds for agreement on factual issues.  As well, it is often helpful to emphasize super-ordinate goals, such as improving the economy as a whole in this case, as opposed to competitive goals such as hiring mandates.  Lastly, hopefully Hollande, as a socialist who is fighting for business interests, can help muddy the group boundaries that can make conflicts more intractable, providing an example of someone who is indeed focused on shared goals.

Below is the segment, and my appearance is about 2 minutes into the video.

- Ravi Iyer


Pew Research highlights Social, Political and Moral Polarization among Partisans, but more people are still Moderates

Reposted from this post on the Civil Politics Blog

A recent research study by Pew highlights societal trends that have a lot of people worried about the future of our country.  While many people have highlighted the political polarization that exists and others have pointed to the social and psychological trends underlying that polarization, Pew’s research report is unique for the scope of findings across political, social, and moral attitudes.  Some of the highlights of the report include:

  • Based on a scale of 10 political attitude questions, such as a binary choice between the statements “Government is almost always wasteful and inefficient” and “Government often does a better job than people give it credit for,” the median Democrat’s and median Republican’s attitudes are further apart than they were in 2004 and 1994.
  • On the above ideological survey, fewer people, whether Democrat, Republican, or independent, are in the middle compared to 1994 and 2004, though it is still worth noting that a plurality, 39%, remain in the middle fifth of the scale.
  • More people on each side see the opposing group as a “threat to the nation’s well being”.
  • Those on the extreme left or extreme right of the ideological survey are more likely to have close friends who agree with them and to live in communities of like-minded people.

 

The study is an important snapshot of current society and clearly illustrates that polarization is getting worse, with the social and moral consequences that moral psychology research would predict when attitudes become moralized.  That being said, I think it is important not to lose sight of the below graph from their study.

 

Pew Survey Shows a Shrinking Plurality holds Moderate Views


Specifically, while there certainly is a trend toward moralization and partisanship, the majority of people are in the middle of the above distributions and hold mixed opinions on political issues.  It is important that those of us who study polarization don’t exacerbate perceived differences, as research has shown that perceptions of differences can become reality.  Most Americans (79%!) still fall somewhere between having consistently liberal and consistently conservative attitudes on political issues, according to Pew’s research.  And even amongst those on the ends of this spectrum, 37% of conservatives and 51% of liberals have close friends who disagree with them.  Compromise between parties is still the preference of most of the electorate.  If those of us who hold a mixed set of attitudes can make our views more prominent, thereby reducing the salience of group boundaries, research suggests this would mitigate the alarming trend toward social, moral, and political polarization.

- Ravi Iyer


Comparing World Cup Prediction Algorithms – Ranker vs. FiveThirtyEight

Reposted from this post on the Ranker Data Blog

Like most Americans, I pay attention to soccer/football once every four years.  But I think about prediction almost daily, and so this year’s World Cup will be especially interesting to me, as I have a dog in this fight.  Specifically, UC-Irvine Professor Michael Lee put together a prediction model based on the combined wisdom of Ranker users who voted on our Who will win the 2014 World Cup list, plus the structure of the tournament itself.  The methodology runs in contrast to the FiveThirtyEight model, which uses entirely different data (national team results plus the results of players who will be playing for the national team in league play) to make its predictions.  As such, the battle lines are clearly drawn.  Will the wisdom of crowds outperform algorithmic analyses based on match results?  Put another way, this is a test of whether human beings notice things that aren’t picked up in the box scores and statistics that form the core of FiveThirtyEight’s predictions and of sabermetrics.

So who will I be rooting for?  Both methodologies agree that Brazil, Germany, Argentina, and Spain are the teams to beat.  But the crowds believe that those four teams are relatively evenly matched while the FiveThirtyEight statistical model puts Brazil as having a 45% chance to win.  After those first four, the models diverge quite a bit with the crowd picking the Netherlands, Italy, and Portugal amongst the next few (both models agree on Colombia), while the FiveThirtyEight model picks Chile, France, and Uruguay.  Accordingly, I’ll be rooting for the Netherlands, Italy, and Portugal and against Chile, France, and Uruguay.

In truth, the best model would combine the signal from both methodologies, similar to how the Netflix Prize was won or how baseball teams combine scouting and sabermetric opinions.  I’m pretty sure that Nate Silver would agree that his model would be improved by adding our data (or similar data from betting markets, which likewise think that FiveThirtyEight is underrating Italy and Portugal), and vice versa.  Still, even knowing that chance will play a big part in the outcome, I’m hoping Ranker data wins in this year’s World Cup.
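A simple version of that combination is to average the win probabilities the two approaches imply. The numbers below are illustrative assumptions, not actual model output (only Brazil’s 45% figure comes from the FiveThirtyEight model cited above):

```python
# Illustrative win probabilities for the top four teams.
# crowd: an "evenly matched" top four, as the Ranker voting suggests;
# model: FiveThirtyEight-style output (45% Brazil is from the post).
crowd = {"Brazil": 0.25, "Germany": 0.24, "Argentina": 0.23, "Spain": 0.22}
model = {"Brazil": 0.45, "Germany": 0.12, "Argentina": 0.10, "Spain": 0.08}

# Equal-weight blend of the two sources
blend = {team: 0.5 * crowd[team] + 0.5 * model[team] for team in crowd}
# blend["Brazil"] -> 0.35
```

In practice the weights would be tuned to each source’s historical accuracy rather than fixed at 50/50, but even the naive average draws on signal from both.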

- Ravi Iyer


Intuitionism in Practice: How the Village Square puts Relationships First

Reposted from this post on the Civil Politics Blog

Our friends at the Village Square recently wrote an article about how they have been able to bridge partisan divides in their community, based on their experiences at the numerous community dinners they put on in their neighborhoods.  Their experience dovetails nicely with findings from academic psychology, specifically that any type of attitude change requires appealing to people’s intuitive side, in addition to their rational side.  Accordingly, their “irreverently named programs are part civic forum, part entertainment,” where they seek first to build relationships that open people’s minds before attempting to get people to rationally understand the other side’s arguments.  From the article:

In “The Big Sort: Why the Clustering of Like-minded America is Tearing Us Apart,” Bill Bishop documents how, in nearly all aspects of life, we’ve become less connected to those who don’t share our views – in the churches we go to, the clubs we join, the neighborhoods we live in.

No longer engaging across the aisle with neighbors, there’s little to mitigate the human tendency toward tribalism. Once we’ve demonized each other, the simple act of talking is tantamount to negotiating with evil.

To address this challenge, our irreverently named programs are part civic forum, part entertainment. Each event is casual (the stage is set up to feel like the facilitator’s living room) and involves sharing food. As we begin, we give out two “civility bells,” ask that the audience avoid tribal “team clapping,” and share a quote to inspire our better angels. We welcome fluid audience participation and always try to laugh.

Since we first imagined The Village Square, we have repeatedly returned to the same conclusion: We can’t wait around for Washington to lead on this. It’s in our hometowns, where we carpool to softball games and borrow cups of sugar, where we can most easily have the conversations democracy requires of us.

Recently, there has been a lot of re-examination of social science findings that may or may not replicate, especially in real-world environments.  The fact that social science research emphasizing the importance of personal relationships in changing attitudes has found real-world application and validation is comforting for those of us who would like to leverage this research in reducing morally laden conflicts.  Those of us who would like to mitigate the natural animosity that arises when competing groups form would do well to follow the Village Square’s lead and put relationships first.

- Ravi Iyer


Cantor Loss shows Crowdsourcing, not Polling, is the Future of Prediction

Eric Cantor, the second most powerful person in the House of Representatives, lost in the Republican primary today to the relatively unknown Dave Brat.  While others have focused on the historic nature of the loss, given Cantor’s position in his party, or on the political ramifications, I was most intrigued by the fact that recent polls predicted Cantor would win by 34 points or by 12 points.  In the end, Cantor lost by more than 10 points.

How did the polls get it so wrong?  In an age where people are used to blocking out web ads, fast-forwarding through commercials, and screening their calls, using automated phone technology to ask people who they will vote for, and assuming you’ll get an unbiased sample (i.e. that people who answer such polls don’t differ from those who don’t), seems unwise.  The first banner ad got a 44% clickthrough rate, but now banner ads are clicked on by only a small minority of internet users.  As response rates fall, bias is inevitable.
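The arithmetic of non-response bias is easy to sketch: if the people who still answer automated calls differ systematically from those who screen them, a phone poll can miss by double digits. All shares below are hypothetical assumptions, not measurements from this race:

```python
# Hypothetical subpopulations: challenger support differs between
# people who answer automated calls and people who screen them.
support_answerers = 0.40   # challenger support among those who answer (assumption)
support_screeners = 0.60   # challenger support among those who screen (assumption)
share_answerers = 0.35     # shrinking share who still answer calls (assumption)

true_support = (share_answerers * support_answerers
                + (1 - share_answerers) * support_screeners)   # 0.53
poll_estimate = support_answerers          # the poll only reaches answerers
nonresponse_bias = true_support - poll_estimate   # 0.13 -> off by 13 points
```

No amount of extra sample size fixes this: the poll converges ever more precisely on the wrong subpopulation.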

Pollsters may try to weight their polls and use new techniques to produce more perfect polls, but non-response bias will only get worse as consumers learn to block out more and more solicitations using technology.  On the other hand, a good crowdsourcing algorithm, such as the algorithm we use to produce Ranker lists, does not require the absence of bias.  Rather, such an algorithm combines multiple sources of information, with the goal of finding sources of uncorrelated error.  In this case, polling data could have been combined with the GOP convention straw poll, the loss of one of Cantor’s lieutenants in an earlier election, and the lack of support from Republican thought leaders to form a better picture of the election possibilities, as the non-response bias in regular polling is a different kind of bias than these other measurements likely have, so aggregating these methods should produce a better answer.
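A sketch of that aggregation idea: each signal is biased, but when the biases point in different directions (uncorrelated error), even a simple average lands closer to the truth than the typical single source. All numbers below are hypothetical:

```python
true_margin = -10.0  # Cantor actually lost by roughly 10 points

# Four hypothetical signals; each reading = truth + its own systematic bias.
# The biases point in different directions (uncorrelated error).
readings = {
    "automated poll": true_margin + 22,          # large nonresponse bias
    "convention straw poll": true_margin + 4,
    "earlier lieutenant's race": true_margin - 6,
    "thought-leader support": true_margin - 2,
}

combined = sum(readings.values()) / len(readings)   # -5.5
combined_error = abs(combined - true_margin)        # 4.5
avg_single_error = sum(abs(r - true_margin)
                       for r in readings.values()) / len(readings)  # 8.5
```

A real aggregation algorithm would weight sources by reliability, but even this unweighted average cuts the typical error roughly in half.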

This is easy to say in hindsight, and it is doubtful that any crowdsourcing technique could have predicted Cantor’s loss given the available data.  But more and more data is being produced, and more and more bias is being introduced into traditional polling, so this won’t always be the case; I would predict that we will increasingly see less accurate polls and more accurate use of alternative methods to predict the future.  The arc of history is bending toward a world where intelligently combining lots of imperfect non-polling measurements is likely to yield a better answer about the future than any one attempt to find the perfect poll.

- Ravi Iyer
