Analysis of a Personal Public Talk

Heart Rate

It’s always a fun exercise to monitor your heart rate during uncommon, out-of-the-comfort-zone events. We are often aware of our state: we feel the stress, the agitation, the palpitations! But we are likely to lose focus on that internal state eventually, because something else requires or shifts our attention.

Plot of my Fitbit heart-rate measurements for 2016-11-22, one entry per minute.
  • A: arrived at the location of the event (preceded by a 15-minute fast walk)
  • B: start of the first speaker’s talk. Here I’m just sitting and listening, and most likely stressing myself further by consciously trying to relax
  • C: on the stage, start of my talk
  • D: start of the Q&A session
  • E: back to the chair
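
To put numbers on the phases marked in the plot, one option is to average the per-minute readings between consecutive markers. This is just a minimal sketch: the `phase_means` helper, the minute offsets and the bpm values are all made up for illustration and are not my actual Fitbit data.

```python
from statistics import mean


def phase_means(samples, markers):
    """Average heart rate between consecutive event markers.

    samples: list of (minute, bpm) pairs, one entry per minute.
    markers: dict mapping a label (e.g. "A") to the minute it happened.
    """
    ordered = sorted(markers.items(), key=lambda item: item[1])
    result = {}
    # Pair each marker with the next one to get half-open intervals [start, end)
    for (label, start), (next_label, end) in zip(ordered, ordered[1:]):
        bpm = [b for minute, b in samples if start <= minute < end]
        if bpm:
            result[f"{label}-{next_label}"] = mean(bpm)
    return result


# Hypothetical values, only to show the shape of the input
samples = [(0, 100), (1, 102), (2, 90), (3, 92), (4, 120), (5, 124)]
markers = {"A": 0, "B": 2, "C": 4}
print(phase_means(samples, markers))
```

The last marker has no following marker, so it only closes the previous phase; add an explicit end marker if you want that final stretch included.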

Speech Analysis

For the speech analysis, I am going to analyze only the transcript of my talk. A lot of data will be lost because of this decision, and I’m not talking just about the possible inaccuracy of speech-to-text results. Here is a list of important aspects of public speaking which are lost when considering only a basic textual representation:

Speech To Text

First things first: speech-to-text. I didn’t do it on the spot with tools for real-time text generation; instead I relied on the video recording set up for the event.
I got my video file, extracted the audio content, cropped the unnecessary parts (including the Q&A session) and went for a speech-to-text solution.
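
The extract-and-crop step can be done in a single pass with a command-line tool. The sketch below assumes ffmpeg, which is my pick for this example and not necessarily what was used here; the file names, offsets and the `build_ffmpeg_command` helper are hypothetical. It only builds the command, so nothing runs until you pass it to `subprocess.run`.

```python
def build_ffmpeg_command(video_path, audio_path, start_seconds, duration_seconds):
    """Build an ffmpeg call that drops the video stream and keeps one audio span."""
    return [
        "ffmpeg",
        "-i", video_path,             # input video file
        "-ss", str(start_seconds),    # skip everything before the talk starts
        "-t", str(duration_seconds),  # keep only this many seconds (cuts the Q&A)
        "-vn",                        # no video in the output
        "-ac", "1",                   # downmix to a single (mono) channel
        audio_path,
    ]


# Hypothetical file names and offsets
command = build_ffmpeg_command("talk.mp4", "talk.wav", 60, 1140)
print(" ".join(command))
# To actually run it: subprocess.run(command, check=True)
```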

Basic Text Analysis

Considering only the textual information, I already ended up with this basic but neat summary of my talk:
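
As an illustration of the kind of computation behind such a summary, here is a minimal sketch using only the standard library. The `summarize` helper, the choice of measures and the sample sentence are assumptions for the example, not my actual numbers. Tokens starting with `%` (such as `%HESITATION`) are skipped so they don't count as words.

```python
from collections import Counter


def summarize(transcript, top_n=3):
    """Return total words, unique words and the most frequent words."""
    words = []
    for token in transcript.lower().split():
        if token.startswith("%"):  # skip markers such as %HESITATION
            continue
        word = token.strip(".,!?;:")
        if word:
            words.append(word)
    counts = Counter(words)
    return {
        "total_words": len(words),
        "unique_words": len(counts),
        "most_common": counts.most_common(top_n),
    }


# Made-up sentence, just to show the output shape
summary = summarize("the cat and the dog %HESITATION and the bird", top_n=2)
print(summary)
```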

Words Alignment and Speech Rate

Let’s consider again the speech-to-text results from the IBM Watson service. Here is what the first five rows (out of 3315, one for each word) of the cleaned results look like:
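
Getting to one row per word could look roughly like the sketch below. It assumes the JSON layout the Watson service returns when word timestamps are requested (a `results` list whose `alternatives` carry `timestamps` entries of the form `[word, start, end]`). The `flatten_timestamps` helper, the `word` and `time_start` column names and the sample response are hypothetical; only `time_end` is a name used in this post.

```python
def flatten_timestamps(response):
    """Turn a speech-to-text response into a flat list of per-word rows."""
    rows = []
    for result in response.get("results", []):
        alternatives = result.get("alternatives", [])
        if not alternatives:
            continue
        best = alternatives[0]  # keep only the top-ranked alternative
        for word, start, end in best.get("timestamps", []):
            rows.append({"word": word, "time_start": start, "time_end": end})
    return rows


# Hypothetical response fragment, not real output from my talk
response = {
    "results": [
        {"alternatives": [{"transcript": "hello everyone ",
                           "timestamps": [["hello", 0.0, 0.5],
                                          ["everyone", 0.5, 1.1]]}]},
        {"alternatives": [{"transcript": "%HESITATION ",
                           "timestamps": [["%HESITATION", 2.0, 2.4]]}]},
    ]
}
rows = flatten_timestamps(response)
print(rows[:5])
```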

Histogram of the time_end variable after binning, equivalent to a binned word count.
  • using fewer words, which can be caused by two factors: more spacing/pauses between words, or usage of longer words, which take more time to pronounce
  • lower word speed (the speed at which a word is pronounced, measured as the length of word w divided by the time taken to pronounce w)
Scatterplots for each derived measure. The X axis is the bin index.
Violin plot showing the distribution of %HESITATION occurrences.
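
The measures behind these plots can be derived from the per-word rows with a simple binning pass. The sketch below is an approximation under stated assumptions: the `binned_measures` helper, the 60-second bin width, the `word` and `time_start` field names and the sample rows are all hypothetical. Word speed follows the definition above (length of the word divided by the time taken to pronounce it), averaged over the words in each bin.

```python
from collections import defaultdict


def binned_measures(rows, bin_seconds=60.0):
    """Per-bin word count, %HESITATION count and mean word speed (chars/second)."""
    bins = defaultdict(lambda: {"word_count": 0, "hesitations": 0, "speeds": []})
    for row in rows:
        # Bin index comes from time_end, as in the histogram
        current = bins[int(row["time_end"] // bin_seconds)]
        if row["word"] == "%HESITATION":
            current["hesitations"] += 1
            continue
        current["word_count"] += 1
        duration = row["time_end"] - row["time_start"]
        if duration > 0:  # guard against zero-length alignments
            current["speeds"].append(len(row["word"]) / duration)
    measures = {}
    for index in sorted(bins):
        speeds = bins[index]["speeds"]
        measures[index] = {
            "word_count": bins[index]["word_count"],
            "hesitations": bins[index]["hesitations"],
            "mean_word_speed": sum(speeds) / len(speeds) if speeds else None,
        }
    return measures


# Made-up rows, only to show the mechanics
sample_rows = [
    {"word": "hello", "time_start": 0.0, "time_end": 0.5},
    {"word": "%HESITATION", "time_start": 0.5, "time_end": 1.0},
    {"word": "hi", "time_start": 1.0, "time_end": 1.5},
    {"word": "talk", "time_start": 61.0, "time_end": 61.5},
]
measures = binned_measures(sample_rows)
print(measures)
```

A low word count in a bin with a normal mean word speed points to pauses, while a low count with long words points to vocabulary; keeping the two measures separate is what lets you tell them apart.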



Data Scientist @ Zalando Dublin - Machine Learning, Computer Vision and Everything Generative ❤

Alex Martinelli