Text Analysis using Elasticsearch/Lucene


3.2.1 What is Text Analysis?

Analysis is the secret sauce in Elasticsearch’s ability to deal with natural language and other complex data. Elasticsearch has a large toolbox with which we can slice and dice words so that they can be searched efficiently.
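To make the idea of "slicing and dicing" concrete, here is a minimal sketch in plain Python of what an analysis step does in spirit: split text into tokens, normalize them, drop common words, and record which documents contain each term. It does not use Elasticsearch or Lucene; the names `analyze`, `build_inverted_index`, and `STOP_WORDS` are hypothetical, and the stop-word list is a toy assumption.

```python
import re

# Hypothetical, deliberately tiny stop-word list (an assumption for illustration).
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is"}


def analyze(text):
    """Lowercase the text, split it into word tokens, and drop stop words."""
    tokens = re.findall(r"\w+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def build_inverted_index(docs):
    """Map each analyzed term to the set of document ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for term in analyze(text):
            index.setdefault(term, set()).add(doc_id)
    return index


if __name__ == "__main__":
    docs = {1: "The quick fox", 2: "A lazy fox"}
    index = build_inverted_index(docs)
    # Looking up a term is now a dictionary access rather than a text scan.
    print(sorted(index["fox"]))
```

Because queries are passed through the same `analyze` step, a search for "Fox" and a document containing "fox" end up matching on the same term.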

Sentiment analysis: slides from TechniCity

Screenshotted slides from a lecture on microparticipation by Jennifer Evans-Cowley (Ohio State University), on week 8 of Coursera’s TechniCity MOOC.

Big Data and Preventive Government: A Review of Joshua Mitts' Proposal for a "Predictive Regulation" System

In Minority Report, Steven Spielberg’s futuristic movie set in 2050 Washington, D.C., three sibling “pre-cogs” are hooked up with wires and stored in a strange looking kiddie pool to predict the occurrence of criminal acts.

Mechanics of Text Analysis II

In my previous post I introduced a process that students can use to learn how to analyze texts. Here is a specific example. The text is Woodrow Wilson’s Fourteen Points.

Creating a Density Map in R with Zipcodes

Though this tutorial is not specifically geared towards text analysis, I thought it would be helpful to anyone. When I had to learn how to create these maps using R, there was no thoroughly comprehensive how-to guide.


Word cloud of school choice interviews of regionally (non-metro) based parents

Out of curiosity, below is a word cloud of the 4 school choice interviews of parents from regional Victoria, Australia.

It will be interesting to see how word cloud visualizations compare to more sophisticated linguistic analysis techniques.
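A word cloud is essentially a picture of word frequencies, so the counting step behind one can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `word_frequencies`, its parameters, and the sample strings are hypothetical and are not taken from the interviews.

```python
import re
from collections import Counter


def word_frequencies(transcripts, stop_words=frozenset(), top_n=10):
    """Count word occurrences across transcripts, ignoring stop words."""
    counts = Counter()
    for text in transcripts:
        # Words are runs of letters, optionally with one internal apostrophe.
        for word in re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower()):
            if word not in stop_words:
                counts[word] += 1
    return counts.most_common(top_n)


if __name__ == "__main__":
    # Made-up sample text, not real interview data.
    sample = ["school choice school", "choice of school"]
    print(word_frequencies(sample, stop_words={"of"}, top_n=2))
```

The resulting (word, count) pairs are what a word cloud renderer scales into font sizes, which is also why the visualization says nothing about context or word order.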

AlchemyAPI points its deep learning service at ad inventory

Deep-learning-as-a-service startup AlchemyAPI has launched a new product aimed at advertisers that want to better classify their content and turn it into targeted ad inventory. Automation is probably the major benefit of deep learning technologies right now, because users are able to tag and classify text and images without using many human resources and with fairly high confidence that the results will be accurate.

