Graded Vector Representations of Immunoglobulins Produced in Response to West Nile Virus
Abstract
Semantic vector models generate high-dimensional vector representations of words from their occurrence statistics across large corpora of electronic text. In these models, each occurrence of a word or number, including a numerical measurement of a continuous property, is treated as a discrete event. Furthermore, the sequence in which words occur is often ignored. In earlier work, we developed approaches to address these limitations, using graded demarcator vectors to represent measured distances in high-dimensional space. This permits incorporation of continuous properties, such as the position of a character within a term or a year of birth, into semantic vector models. In this paper, we extend this work by developing a novel representational approach for protein sequences, in which both the positions and the properties of the amino acid components of protein sequences are represented using graded vectors. Evaluation on a set of around 100,000 immunoglobulin receptor sequences derived from subjects recently infected with West Nile Virus (WNV) suggests that encoding positions and properties using graded vectors increases the similarity between immunoglobulin receptor sequences produced by cells from ancestral lines known to have developed in response to WNV, relative to those from other cell lines.
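
The idea of graded vectors mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes real-valued vectors, normalized linear interpolation between two random demarcator vectors, elementwise multiplication as the binding operation, and random bipolar residue vectors. The function names, the dimensionality of 2048, and the toy sequences are all hypothetical, and only positions are graded here, not amino acid properties.

```python
import numpy as np

def demarcators(dim, rng):
    """Two random unit vectors marking the two ends of a measured range (assumed scheme)."""
    a, b = rng.standard_normal((2, dim))
    return a / np.linalg.norm(a), b / np.linalg.norm(b)

def graded_vector(start, end, t):
    """Normalized linear interpolation between the demarcators, with t in [0, 1]."""
    v = (1.0 - t) * start + t * end
    return v / np.linalg.norm(v)

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def encode_sequence(seq, residue_vecs, start, end):
    """Sum of residue vectors, each bound to a graded position vector.

    Binding is an elementwise product here, which is an assumption of this sketch.
    """
    n = len(seq)
    total = np.zeros_like(start)
    for i, aa in enumerate(seq):
        t = i / (n - 1) if n > 1 else 0.0  # relative position in the sequence
        total += graded_vector(start, end, t) * residue_vecs[aa]
    return total

rng = np.random.default_rng(0)
dim = 2048  # hypothetical dimensionality
start, end = demarcators(dim, rng)
# One random bipolar (+1/-1) vector per standard amino acid letter.
residue_vecs = {aa: rng.choice([-1.0, 1.0], size=dim) for aa in "ACDEFGHIKLMNPQRSTVWY"}

# Nearby positions get similar vectors; distant positions get dissimilar ones.
sim_near = cosine(graded_vector(start, end, 0.0), graded_vector(start, end, 0.1))
sim_far = cosine(graded_vector(start, end, 0.0), graded_vector(start, end, 0.9))

# Toy sequences: one substitution versus a full reversal of the same residues.
base = encode_sequence("CARDY", residue_vecs, start, end)
sim_substituted = cosine(base, encode_sequence("CARDW", residue_vecs, start, end))
sim_reversed = cosine(base, encode_sequence("YDRAC", residue_vecs, start, end))
```

Under these assumptions, a sequence that differs by one residue stays closer to the original than a sequence with the same residues in reverse order, because a residue that shifts position is still partly matched when the shift is small. A continuous amino acid property could in principle be graded in the same way with a second pair of demarcator vectors, though the abstract does not specify how the authors do this.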
