Participants in an asynchronous conversation (e.g., forums, emails) interact with each other at different times, performing certain communicative acts, called speech acts (e.g., question, request). In this article, we propose a hybrid approach to speech act recognition in asynchronous conversations. Our approach works in two steps: a long short-term memory (LSTM) recurrent neural network first encodes each sentence separately into a distributed representation; these representations are then used in a conditional structured model to capture the conversational dependencies between sentences. The structured model can consider arbitrary graph structures to model conversational dependencies in an asynchronous conversation. In addition, to mitigate the problem of limited annotated data in asynchronous domains, we adapt the recurrent model to learn from synchronous conversations (e.g., meetings) using adversarial training of neural networks. Empirical evaluation shows the effectiveness of our approach over existing ones: (i) LSTMs provide better task-specific representations, (ii) the global structured model improves over local models, and (iii) adversarial training gives better domain-invariant representations.
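To make the architecture concrete, the sketch below illustrates two of the components the abstract describes: a bidirectional LSTM sentence encoder and adversarial domain adaptation via a gradient-reversal layer. This is a minimal illustration, not the authors' implementation; the conditional structured (CRF-style) layer over conversational graphs is elided and replaced here by a simple per-sentence classification head, and all class names, dimensions, and hyperparameters are illustrative assumptions.

```python
import torch
import torch.nn as nn
from torch.autograd import Function


class GradReverse(Function):
    """Identity in the forward pass; scales the gradient by -lambda on
    the backward pass, so the encoder is trained to *confuse* the domain
    classifier (adversarial training)."""

    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None


class SpeechActModel(nn.Module):
    """Hypothetical two-head model: a speech-act head (task) and a
    domain head behind gradient reversal (adversary)."""

    def __init__(self, vocab_size, emb_dim=128, hid_dim=128,
                 n_acts=5, n_domains=2):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        # Bidirectional LSTM encodes each sentence into a fixed vector.
        self.lstm = nn.LSTM(emb_dim, hid_dim, batch_first=True,
                            bidirectional=True)
        self.act_clf = nn.Linear(2 * hid_dim, n_acts)     # task head
        self.dom_clf = nn.Linear(2 * hid_dim, n_domains)  # adversary head

    def encode(self, tokens):
        # tokens: (batch, seq_len) word ids; mean-pool the LSTM states
        # to get one distributed representation per sentence.
        out, _ = self.lstm(self.emb(tokens))
        return out.mean(dim=1)  # (batch, 2 * hid_dim)

    def forward(self, tokens, lam=1.0):
        h = self.encode(tokens)
        act_logits = self.act_clf(h)
        # Reversed gradients push the encoder toward domain-invariant
        # features while the domain classifier tries to separate domains.
        dom_logits = self.dom_clf(GradReverse.apply(h, lam))
        return act_logits, dom_logits
```

Under this setup, training would minimize the speech-act loss on labeled (asynchronous) sentences plus the domain-classification loss on sentences from both synchronous and asynchronous conversations; because the domain gradient is reversed before reaching the encoder, the shared representations are driven to be domain-invariant, which is the intuition behind point (iii) above.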