Creating an Elastic Canvas for Twitter while visiting Elasticon 2018

Last week we visited Elasticon 2018 in San Francisco. In our previous blog post we wrote about the keynote and some of the more interesting new features of the Elastic Stack. In this blog post, we take one of the cool new products for a spin: Canvas. But what is Canvas?

Canvas is a composable, extendable, creative space for live data. With Canvas you can combine dynamic data, coming from a query against Elasticsearch for instance, with nice-looking graphs. You can also use tables and images and combine them with the data visualizations to create stunning, dynamic infographics. In this blog post, we create a canvas about the tweets tagged Elasticon during the last day of the Elastic conference last week.

Below is the canvas we are going to create. It contains a number of different elements. The top row contains a pie chart with the language of the tweets, a bar chart with the number of tweets per time unit, and the total number of tweets tracked during the second day of Elasticon. The next two elements use the sentiment of the tweets, which was obtained using IBM Watson. Byron wrote a basic integration with Watson; he will give more details in a future blog post. The pie chart shows the complete results. The green smiley on the right shows the positive tweets as a percentage of all tweets that were analyzed without an error and were not neutral.

Overview canvas

With the example in place, it is time to discuss how to create these canvases yourself. First comes some information about the installation, then a few of the general concepts, and finally sample code for the elements we used.

Installing canvas

Canvas is part of Elastic's Kibana. You have to install Canvas as a plugin into Kibana, and you need to install X-Pack in Elasticsearch as well as in Kibana. The steps are well described on the installation page of Canvas. Beware though, installing the plugins in Kibana takes some time. Elastic is working on improving this, but we have to deal with it at the moment.

Once everything is installed, open Kibana in your browser. At this moment you could start creating the canvas, but you have no data yet, so you have to import some data first. We used Logstash with a Twitter input and an Elasticsearch output. We cannot go into too much detail here or this blog post would become way too long; we might cover it in a future blog post. For now, it is enough to know that we have an index called twitter that contains tweets.

Creating the canvas with the first element

When clicking on the Canvas tab we can create a new Workpad. A Workpad can contain one or multiple pages, and each page can contain multiple elements. A Workpad defines the size of the screen, so it is best to create it for a specific screen size. At Elasticon they had multiple monitors, some of them horizontal, others vertical. You can also choose a background color. These options can be found on the right side of the screen in the Workpad settings bar.

It is good to know that you can create a backup of your Workpad from the Workpads screen, where there is a small download button on the right side. Restoring a Workpad is done by dropping the exported JSON into the dialog.

New work pad

Time to add our first element to the page. Use the plus sign at the bottom of the screen to add an element. You can choose from a number of elements. The first one we’ll try is the pie chart. When adding the pie chart, we see data in the chart. Hmm, how come? We did not select any data. Canvas comes with a default data source that is used in all the elements. This way we immediately see what the element looks like, which is ideal for playing around with all the options. Most options are available using the settings on the right. With the pie, you’ll see options for the slice labels and the slice angles. You can also see the Chart style and Element style. These configuration options show a button with a plus sign. With this button, you can add options like color palette and text size and color. For the element, you can set a background color, border color, opacity and padding.

Add element

Next, we want to assign our own data source to the element. After adding our own data source we most likely have to change the parameters for the element as well. In this case, we have to change the Slice labels and angles. Changing the data source is done by clicking the Change Datasource link at the bottom. At the moment there are 4 data sources: demo data, demo prices, Elasticsearch query and Timelion query. I’ll choose the Elasticsearch query, select the index, leave the query empty and select the fields I need. Selecting only the fields I need can speed up the element, as we only parse the data that we actually use. In this example, we only use the sentiment label.

Change data source

The last thing I want to mention here is the Code view. After pushing the >_ Code button you’ll see a different view of your element: the expression that produces it. This is more powerful than the settings window. But with great power comes great responsibility. It is easy to break stuff here. The code is organized in steps, and the output of each step is the input for the next step. In this specific example, there are five steps: first a filter step, next the data source, then a point series that is required for a pie diagram, then the pie itself, and finally the render step. If you change something using the settings, the code tab gets updated immediately. If I add a background color to the container, the render step becomes:

render containerStyle={containerStyle backgroundColor="#86d2ed"}

If you make changes in the code block, use the Run button to apply the changes. In the next sections, we will only work in this code tab, just because it is easier to show to you.
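To make the idea of chained steps concrete, here is a minimal Python sketch of the same principle: each step receives the output of the previous one. This is not Canvas code. The step functions and the sample rows are hypothetical stand-ins, chosen only to mirror the data source, transform and render stages described above.

```python
from functools import reduce
from typing import Any, Callable, Iterable

Step = Callable[[Any], Any]


def run_pipeline(steps: Iterable[Step], initial: Any = None) -> Any:
    """Feed the output of each step into the next one, like `a | b | c`."""
    return reduce(lambda value, step: step(value), steps, initial)


# Hypothetical stand-ins for the steps of an expression.
def data_source(_filters: Any) -> list:
    """Pretend data source: returns a few rows with a sentiment label."""
    return [
        {"sentiment_label": "positive"},
        {"sentiment_label": "negative"},
        {"sentiment_label": "positive"},
    ]


def count_by_label(rows: list) -> dict:
    """Transform step: count the rows per label."""
    counts: dict = {}
    for row in rows:
        label = row["sentiment_label"]
        counts[label] = counts.get(label, 0) + 1
    return counts


def render(counts: dict) -> str:
    """Render step: turn the counts into a displayable string."""
    return ", ".join(f"{label}: {n}" for label, n in sorted(counts.items()))


result = run_pipeline([data_source, count_by_label, render])
print(result)
```

Removing the last step from the list returns the intermediate value instead, which is comparable to deleting the final line of an expression to inspect what the earlier steps produce.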

Code view

Adding more elements

The basics of the available elements and functions are documented here. We won’t go into details for all the different elements we have added. Some of them use the defaults, so you can easily add them yourself. The first one I do want to explain is the Twitter logo with the number of tweets in it. This is actually two different elements. The logo is a static image. The number is more interesting: it makes use of the escount function and the markdown element. Below is the code.

filters
 | escount index="twitter"
 | as num_tweets
 | markdown "{{rows.0.num_tweets}}" font={font family="'Open Sans', Helvetica, Arial, sans-serif" size=60 align="left" color="#ffffff" weight="undefined" underline=false italic=false}

The filters function facilitates filtering (usually by time) using the special filter element. The next item is escount, which does what you expect: it counts the number of items in the provided index. You can also provide a query to limit the results, but we did not need it. The output of escount is a number. This is a problem when sending it to a markdown element, because the markdown element only accepts a datatable. Therefore we have to use the function as, which accepts a number and changes it into a datatable. The markdown element accepts the table and exposes it as rows. Therefore we use rows to obtain the first row, and from that row the column num_tweets. When playing with this element it is easy to remove the markdown line; Canvas will then render the table by default. Below is the output for only the first two lines, followed by the output after adding the third line (as num_tweets).

200
num_tweets #

200
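The wrapping of a bare number into a table can be sketched in a few lines of Python. This is only an illustration of the idea, not how Canvas implements it. The helper names as_table and render_first_cell and the dictionary layout of the table are assumptions made for this sketch.

```python
def as_table(value, column_name):
    """Wrap a single value in a one-row, one-column table (mimics `as`)."""
    return {"columns": [column_name], "rows": [{column_name: value}]}


def render_first_cell(table, column_name):
    """Mimic the template {{rows.0.<column>}}: first row, named column."""
    return str(table["rows"][0][column_name])


count = 200  # the number returned by the count step in the output above
table = as_table(count, "num_tweets")
text = render_first_cell(table, "num_tweets")
print(table)
print(text)
```

The point of the sketch is the shape change: the markdown step needs something it can address as rows, so the number first has to become a row with a named column.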

Next up are the text and the photo belonging to the actual tweets. The photo is a bit different from the Twitter logo as it is a dynamic photo. In the code below you can see that the image element has a dataurl attribute. We can use this attribute to get one cell from the provided datatable. The getCell function has attributes for the row number as well as the name of the column.

esdocs index="twitter*" sort="@timestamp, desc" fields="media_url" count=5 query=""
 | image mode="contain" dataurl={getCell c="media_url" r=2}
 | render
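As a rough Python equivalent of picking one cell from a table, the sketch below uses a hypothetical get_cell helper. It assumes that the row number is zero-based, so that r=2 selects the third document. The example.com URLs are made-up sample data.

```python
def get_cell(rows, column, row=0):
    """Return one cell: the given column of the given (zero-based) row."""
    return rows[row][column]


# Made-up sample data standing in for the five newest documents.
tweets = [
    {"media_url": "http://example.com/photo-1.jpg"},
    {"media_url": "http://example.com/photo-2.jpg"},
    {"media_url": "http://example.com/photo-3.jpg"},
    {"media_url": "http://example.com/photo-4.jpg"},
    {"media_url": "http://example.com/photo-5.jpg"},
]

image_url = get_cell(tweets, "media_url", row=2)
print(image_url)
```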

The text of the tweet is a bit different. Here we want to use the markdown element, but it does not have a dataurl attribute, so we have to come up with a different strategy. If we want to obtain the third item, we select the top 3 and from that top 3 we take the last item.

filters 
| esdocs index="twitter*" sort="@timestamp, desc" fields="text, user.name, created_at" query="" 
| head 3 
| tail 1 
| mapColumn column=created_at_formatted fn=${getCell created_at | formatdate 'YYYY-MM-DD HH:mm:ss'} 
| markdown "{{#each rows}}
**{{'user.name'}}** 

(*{{created_at_formatted}}*)

{{text}}
{{/each}}" font={font family="'American Typewriter', 'Courier New', Courier, Monaco, mono" size=18 align="right" color="#b83c6f" weight="undefined" underline=false italic=false}

The line that starts with mapColumn is a way to format the date. The mapColumn function adds a new column with the name provided by the column attribute and the result of a function as its value. The function can be a chain of functions. In this case, we obtain the column created_at of the datatable and pass it to the formatdate function.
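The two tricks used for the tweet text, taking the last of the first three rows and adding a derived column, can be sketched in Python as follows. The helper names and the sample rows are hypothetical, and the sketch assumes created_at is stored as an ISO 8601 string, which may not match the format in the actual index.

```python
from datetime import datetime


def head(rows, n):
    """Keep the first n rows."""
    return rows[:n]


def tail(rows, n):
    """Keep the last n rows."""
    return rows[-n:]


def map_column(rows, column, fn):
    """Add (or overwrite) a column whose value is computed per row."""
    return [{**row, column: fn(row)} for row in rows]


def format_created_at(row):
    """Reformat the assumed ISO timestamp as YYYY-MM-DD HH:mm:ss."""
    parsed = datetime.fromisoformat(row["created_at"])
    return parsed.strftime("%Y-%m-%d %H:%M:%S")


# Made-up sample rows, already sorted newest first.
rows = [
    {"text": "tweet A", "created_at": "2018-03-01T18:30:00"},
    {"text": "tweet B", "created_at": "2018-03-01T18:29:10"},
    {"text": "tweet C", "created_at": "2018-03-01T18:28:05"},
    {"text": "tweet D", "created_at": "2018-03-01T18:27:00"},
]

third = tail(head(rows, 3), 1)  # the third item only
formatted = map_column(third, "created_at_formatted", format_created_at)
print(formatted)
```

Changing the argument of head selects a different tweet, which is how several text elements on one page can each show a different recent tweet.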

Creating the partly green smiley

The most complicated feature was the smiley that turns greener the more positive tweets we see. The positiveness of the tweets was determined using the IBM Watson interface. In the end, it is the combination of two images, one grey smiley and one green smiley. The green smiley is only shown for a specific percentage. This is done by the revealImage function. First, we show the complete code.

esdocs index="twitter*" fields="sentiment_label" count=10000 
| ply by="sentiment_label" fn=${rowCount | as "row_count"} 
| filterrows fn=${if {getCell "sentiment_label" | compare "eq" to="error"} then=false else=true}
| filterrows fn=${if {getCell "sentiment_label" | compare "eq" to="neutral"} then=false else=true}
| staticColumn column="total" value={math "sum(row_count)"} 
| filterrows fn=${if {getCell "sentiment_label" | compare "eq" to="positive"} then=true else=false}
| staticColumn column="percent" value={math "divide(row_count, total)"} 
| getCell "percent" 
| revealImage image={asset "asset-488ae09a-d267-4f75-9f2f-e8f7d588fae1"} emptyImage={asset "asset-0570a559-618a-4e30-8d8e-64c90ed91e76"}

The first line is like we have seen before: it selects rows from the twitter index. The second line does a kind of grouping of the rows. It groups by the values of sentiment_label, and the value for each group is a row count as specified by the function. If I remove all the other lines we can see the output of just the ply function.

sentiment_label         row_count
negative                32
positive                73
neutral                 81
error                   14

The next steps filter out the rows for error and neutral, then we add a column for the total number of tweets with a positive or negative label. Now each row has this value. Check the following output.

sentiment_label         row_count       total
negative                32              105
positive                73              105

The next line removes the negative row, then we add a column with the percentage, obtain just one cell and call the revealImage function. This function has a number input and attributes for the image as well as the empty or background image.
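The arithmetic behind the smiley can be checked with a short Python sketch. It follows the same order as the expression (group and count, drop error and neutral, compute the total, take the positive share) and uses the row counts from the tables above. The function name positive_share is an assumption made for this sketch.

```python
from collections import Counter


def positive_share(labels):
    """Share of positive labels among the positive and negative ones."""
    counts = Counter(labels)                # group by label and count rows
    for excluded in ("error", "neutral"):   # the two filter steps
        counts.pop(excluded, None)
    total = sum(counts.values())            # the "total" column
    if total == 0:
        return 0.0                          # nothing left to compare
    return counts.get("positive", 0) / total


# Row counts taken from the ply output shown above.
labels = (
    ["negative"] * 32
    + ["positive"] * 73
    + ["neutral"] * 81
    + ["error"] * 14
)

share = positive_share(labels)
print(round(share, 3))  # 73 / 105, roughly 0.695
```

With these numbers the green smiley covers a little under 70 percent of the image.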

That gives us all the different elements on the canvas.

Concluding

We really like the options you have with Canvas. You can easily create good-looking dashboards that contain static resources, help texts and images, combined with dynamic data coming from Elasticsearch and, in the future, most likely other resources.

There are some improvements possible, of course. It would be nice if we could also select doc_value fields, and being able to use aggregations in a query would be welcome as well.

We will closely monitor the progress, as we believe this is going to be a very interesting technology to keep using in the future.