Antibiotic-Visualization

This is a static visualization that communicates this antibiotics data.

View the final visualization here.
View an incomplete but dynamic version here.

Technology

R for data exploration, data manipulation, and variable creation.
Tableau for parallel visualization prototyping and then the heavy lifting of design.
Final touches in Figma.

Journey

I started by building some context for the data. I aimed to visualize “how effective are these antibiotics?” After a quick Google search, I learned that a lower minimum inhibitory concentration (MIC) value means the antibiotic is more effective, and that was enough context to get going.

I started exploring the data in R. I got a rough summary of the data and found I needed to do some data manipulation. I needed to pivot the data to facet it by Antibiotic and Bacteria. Once I got this done and did some fiddling with a faceted dot plot, I had a rough idea that I wanted to encode Bacteria by MIC value faceted by Antibiotic. This new graph would aim to communicate which antibiotic was generally more effective (across the 16 bacteria).
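As a rough illustration of the pivot step, here is a minimal sketch in Python (the original work was done in R). The column names, bacteria names, and MIC values are all placeholders invented for the example, not the real dataset. It assumes the raw data has one row per bacteria and one column per antibiotic, and reshapes it so each (bacteria, antibiotic) pair becomes its own record, which is the shape a faceted plot needs.

```python
# Hypothetical wide-format rows: one row per bacteria, one column per antibiotic.
# All names and MIC values are placeholders, not the real data.
wide_rows = [
    {"bacteria": "bacteria_1", "antibiotic_a": 0.5, "antibiotic_b": 10.0, "antibiotic_c": 2.0},
    {"bacteria": "bacteria_2", "antibiotic_a": 100.0, "antibiotic_b": 1.0, "antibiotic_c": 0.1},
]


def pivot_longer(rows, id_column, key_name="antibiotic", value_name="mic"):
    """Turn every non-id column of every row into its own long-format record."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column == id_column:
                continue  # the id column identifies the row; it is not a measurement
            long_rows.append(
                {id_column: row[id_column], key_name: column, value_name: value}
            )
    return long_rows


long_rows = pivot_longer(wide_rows, id_column="bacteria")
for record in long_rows:
    print(record)
```

Each long-format record can then be grouped by either antibiotic or bacteria, which is what makes faceting in both directions possible.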

The "faceted dot plot" and first Tableau visual

After these initial visuals, I noticed that it was still hard to compare the effectiveness of each antibiotic. A large part of this was because I had chosen to facet by Antibiotic, which made it hard to compare the MIC value for the same Bacteria across facets. I would have had the same issue (but worse) if I had faceted by Bacteria instead, so I decided to encode this comparison with color. To do this, I went back to R to generate a new variable called Relative Scores that encoded an Antibiotic’s performance compared to the others for a given Bacteria. Relative Scores took on a value of [1, 2, 3], with 1 meaning the Antibiotic had the lowest MIC for that Bacteria and 3 meaning it had the highest. Additionally, within each facet, I sorted by MIC to make it easier to compare the different Antibiotic facets in aggregate.
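The Relative Scores idea can be sketched as a within-bacteria ranking. This is a Python illustration under stated assumptions, not the R code actually used: the records are invented placeholders, and the tie-breaking rule (alphabetical by antibiotic name) is an assumption, since the write-up does not say how ties were handled.

```python
# Placeholder long-format records: (bacteria, antibiotic, MIC). Values are invented.
records = [
    ("bacteria_1", "antibiotic_a", 0.5),
    ("bacteria_1", "antibiotic_b", 10.0),
    ("bacteria_1", "antibiotic_c", 2.0),
    ("bacteria_2", "antibiotic_a", 100.0),
    ("bacteria_2", "antibiotic_b", 1.0),
    ("bacteria_2", "antibiotic_c", 0.1),
]


def relative_scores(records):
    """Rank antibiotics within each bacteria: 1 = lowest MIC, 3 = highest MIC."""
    by_bacteria = {}
    for bacteria, antibiotic, mic in records:
        by_bacteria.setdefault(bacteria, []).append((mic, antibiotic))

    scores = {}
    for bacteria, pairs in by_bacteria.items():
        # Sorting (mic, antibiotic) tuples orders by MIC first; equal MICs fall
        # back to antibiotic name, which is an assumption of this sketch.
        for rank, (_, antibiotic) in enumerate(sorted(pairs), start=1):
            scores[(bacteria, antibiotic)] = rank
    return scores


scores = relative_scores(records)
for key, rank in sorted(scores.items()):
    print(key, rank)
```

Because the score depends only on the ordering within one bacteria, it stays comparable across bacteria whose raw MIC values differ by orders of magnitude.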

Two charts that used the new Relative Scores

Experimenting with adding a bar chart summary for Relative Scores

At this point, there was a lot of information not yet immediately visible, so I made a few changes. I changed the x-axis to use symbols because it was hard to read the vertical bacteria names, especially since sorting by MIC shuffled them. I also changed the color scheme because the Relative Score is an ordinal variable, so an ordinal color scheme (shades of blue) makes more sense than one that is more suited for nominal variables (stoplight colors).

I then added a dot plot of Relative Score below the bar chart, since it was difficult to compare the aggregate Relative Score for each Antibiotic. This dual-encodes Relative Score with color and position. Position encodes it much better, especially since sorting by MIC led to a slight clustering effect in the dot plot.

Lastly, I realized the original choice I made to encode Bacteria faceted by Antibiotics on the x-axis meant that it was still really hard to do `Bacteria`-based queries like “What is the Best Antibiotic for (Bacteria)?” To solve this, I added a lookup table to the Bacteria symbol legend. It’s probably a design sin, but luckily I’m not a designer.

Adding symbols to encode bacteria and a secondary lookup table for bacteria-based queries

Final visualization (updated version here)

Rationale

With this visual, I attempted to answer two high-level questions: What is the relative effectiveness of each antibiotic in general? What is the comparative effectiveness of each antibiotic for each bacteria? Four visualizations compose the final visual. Three share an x-axis: a bar chart of the MIC measure, a dot plot of the relative MIC variable, and a bar chart of the relative MIC score count. The fourth is a table of the best antibiotics for each bacteria. The first three visuals answer the first high-level question by allowing comparisons between the three antibiotics overall. The last visual answers the second high-level question directly. I was uncertain where Gram staining fit into the story I was trying to tell, so I encoded it throughout the four visuals. I will go through these four visuals sequentially.

The first bar chart was the visual that encoded the raw quantitative data of the MIC measure. The y-axis, on a log scale, shows the MIC scores (encoded through length). The x-axis shows the bacteria grouped by antibiotic. I chose to encode the MIC measure in this way because length encodes quantitative data well. The bars are colored by a variable I created that represents the MIC needed for an antibiotic compared to the others, with a value of [1, 2, 3] or equivalently [Least, Middle, Most]. Lastly, I added the total and average MIC for each antibiotic and sorted within each antibiotic group by MIC score.
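The per-antibiotic annotations on this chart (sort order, total, and average) can be sketched as follows. This is a Python illustration with invented placeholder values, not the Tableau calculation actually used. The `log10` field is only there to show what a log-scaled axis does to the values; in practice the charting tool applies the scale itself.

```python
import math
from statistics import mean

# Placeholder long-format records: (bacteria, antibiotic, MIC). Values are invented.
records = [
    ("bacteria_1", "antibiotic_a", 0.5),
    ("bacteria_1", "antibiotic_b", 10.0),
    ("bacteria_1", "antibiotic_c", 2.0),
    ("bacteria_2", "antibiotic_a", 100.0),
    ("bacteria_2", "antibiotic_b", 1.0),
    ("bacteria_2", "antibiotic_c", 0.1),
]


def summarize_by_antibiotic(records):
    """Group by antibiotic, sort each group by MIC, and compute total and average."""
    grouped = {}
    for bacteria, antibiotic, mic in records:
        grouped.setdefault(antibiotic, []).append((mic, bacteria))

    summary = {}
    for antibiotic, pairs in grouped.items():
        pairs.sort()  # ascending MIC within the antibiotic group
        mics = [mic for mic, _ in pairs]
        summary[antibiotic] = {
            "order": [bacteria for _, bacteria in pairs],  # bar order in the facet
            "total": sum(mics),
            "average": mean(mics),
            "log10": [math.log10(m) for m in mics],  # bar positions on a log axis
        }
    return summary


summary = summarize_by_antibiotic(records)
for antibiotic, stats in summary.items():
    print(antibiotic, stats)
```

A log scale matters here because MIC values can span several orders of magnitude, so on a linear axis the smallest bars would be nearly invisible.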

The second visualization was a dot plot encoding the relative MIC variable (through position) on the y-axis, sharing the x-axis with the first bar chart. I re-encoded information already carried by the color of the first chart because position is a better channel than color for an ordinal variable. Gram staining is also encoded here with symbols (checkmark or ‘X’). Note that the x-axis encodes the bacteria using symbols rather than words because each bacteria occurs three times, which makes the names difficult to read, particularly when oriented vertically. The third visual is a simple bar chart that summarizes the count of this same variable.

The last visual is separate from the rest. It is a table of the bacteria names with three columns that correspond to each antibiotic. There is a symbol in each cell where an antibiotic is the best for a particular bacteria. This table also doubles as a key for the bacteria symbols. I chose to make this table because specific queries like “what antibiotic is best for bacteria X?” were hard to make using the first three visualizations. You would have to decode the bacteria symbol three times spread over the x-axis and compare the colors of the corresponding bars. This table does not require decoding symbols, since it doubles as a key and encodes the best antibiotic directly. The Gram staining information is also encoded here for similar reasons.
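The lookup behind this table amounts to picking, for each bacteria, the antibiotic with the lowest MIC. Here is a minimal Python sketch with invented placeholder data. How ties were resolved in the actual table is not stated, so keeping the first antibiotic encountered is an assumption.

```python
# Placeholder long-format records: (bacteria, antibiotic, MIC). Values are invented.
records = [
    ("bacteria_1", "antibiotic_a", 0.5),
    ("bacteria_1", "antibiotic_b", 10.0),
    ("bacteria_1", "antibiotic_c", 2.0),
    ("bacteria_2", "antibiotic_a", 100.0),
    ("bacteria_2", "antibiotic_b", 1.0),
    ("bacteria_2", "antibiotic_c", 0.1),
]


def best_antibiotic_table(records):
    """Map each bacteria to the antibiotic with the lowest MIC."""
    best = {}
    for bacteria, antibiotic, mic in records:
        current = best.get(bacteria)
        # Strict '<' keeps the first antibiotic seen on a tie (an assumption).
        if current is None or mic < current[1]:
            best[bacteria] = (antibiotic, mic)
    return {bacteria: antibiotic for bacteria, (antibiotic, _) in best.items()}


table = best_antibiotic_table(records)

# Print one row per bacteria with an 'x' under the best antibiotic.
antibiotics = sorted({antibiotic for _, antibiotic, _ in records})
print("bacteria".ljust(12) + "".join(name.ljust(14) for name in antibiotics))
for bacteria, winner in table.items():
    cells = "".join(("x" if name == winner else "").ljust(14) for name in antibiotics)
    print(bacteria.ljust(12) + cells)
```

Answering a bacteria-based query is then a single row lookup rather than three separate reads across the shared x-axis.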