Understanding the nuances of speech emotion dataset curation and labeling is important for assessing the real-world potential of speech emotion recognition (SER) models. Most training and evaluation datasets contain acted or pseudo-acted speech (e.g., podcast speech) in which emotion expressions may be exaggerated or otherwise intentionally modified. Furthermore, datasets labeled based on crowd perception often lack transparency regarding the guidelines given to annotators. These factors make it difficult to interpret model performance and pinpoint important areas for improvement. To address this gap, we identified the Switchboard corpus as a promising source of naturalistic conversational speech, and we trained a crowd to label the dataset for categorical emotions (anger, contempt, disgust, fear, sadness, surprise, happiness, tenderness, calmness, and neutral) and dimensional attributes (activation, valence, and dominance). We refer to this label set as Switchboard-Affect (SWB-Affect). In this work, we present our approach in detail, including the definitions provided to annotators and an analysis of the lexical and paralinguistic cues that may have played a role in their perception. In addition, we evaluate state-of-the-art SER models, and we find variable performance across the emotion categories, with especially poor generalization for anger. These findings underscore the importance of evaluation with datasets that capture natural affective variation in speech. We release the labels for SWB-Affect to enable further analysis in this domain.