Abstract
We present a framework for semantic situation understanding and interpretation of multimodal data using Description Logics (DL) and rules. More precisely, we use DL models to formally describe contextualised dependencies among verbal and non-verbal descriptors in multimodal natural language interfaces, while context aggregation, fusion and interpretation are supported by SPARQL rules. Both background knowledge and multimodal data, e.g. language analysis results and facial expressions and gestures recognised from multimedia streams, are captured as OWL 2 ontology axioms, the de facto standard formalism for DL models on the Web, fostering the reusability, adaptability and interoperability of the framework. The framework has been applied in the field of healthcare, providing the models for the semantic enrichment and fusion of verbal and non-verbal descriptors in dialogue-based systems.
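As a purely illustrative sketch of the kind of SPARQL rule the abstract refers to (the namespace, class and property names below are invented for this example and are not taken from the paper's actual models), a CONSTRUCT rule fusing a verbal descriptor with a co-occurring non-verbal one might look like:

```sparql
PREFIX ex: <http://example.org/mmdialogue#>

# Toy fusion rule: when a verbal descriptor and a facial-expression
# descriptor are observed in the same dialogue turn, derive a fused
# interpretation that links evidence from both modalities.
CONSTRUCT {
  ?turn ex:hasFusedInterpretation [
      a ex:FusedInterpretation ;
      ex:verbalEvidence    ?verbal ;
      ex:nonVerbalEvidence ?facial
  ] .
}
WHERE {
  ?turn a ex:DialogueTurn ;
        ex:hasVerbalDescriptor    ?verbal ;
        ex:hasNonVerbalDescriptor ?facial .
  ?facial a ex:FacialExpression .
}
```

Executing such a rule over the OWL 2 knowledge base would materialise new fused-interpretation individuals, which downstream interpretation rules can then consume.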