The following test cases show how screen readers handle span vs div content, as well as inline vs block CSS. I tested with VoiceOver, and the result is shown below each test case. I refer to the cases as tests 1, 2, and 3, in order.
<div>
<span>Span Text</span>
<div>Div Text</div>
</div>
Result: read as two paragraphs - "Span Text (pause) Div Text"
<div>
<div>
<span>Span Text</span>
Div Text
</div>
</div>
Result: read as one paragraph - "Span Text Div Text"
<div>
<span>Span Text</span>
<div style="display:inline">Div Text</div>
</div>
Result: read as one paragraph - "Span Text Div Text"
So, I think I understand why tests 1 and 2 behave the way they do. A div is treated as a 'group' element, while a span is not. Before the screen reader reads a new group element, it knows to pause (for example, before starting a new paragraph). You can also see this in the Accessibility Inspector on macOS.
Test 3, however, surprised me a little. CSS can have an effect on screen reader output, but I haven't seen anything in the standards about the CSS 'display' property.
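A converse case could help isolate the effect of 'display': a span forced to block. This is only a sketch and is not one of the three tests above. It has not been run through VoiceOver here, so no result is given. If 'display' is what drives the grouping, the second span would presumably be announced after a pause, as in test 1.

```html
<div>
<!-- Hypothetical extra case, not part of the original tests -->
<span>Span Text</span>
<span style="display:block">Block Span Text</span>
</div>
```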
This leads me to two questions: