3 years ago

# What does it mean for data to be 'observed' or 'missing'?

John C. Galati

In statistical modelling of incomplete data, missingness is encoded as a relation between datasets $Y$ and response patterns $R$. We identify two different meanings of 'observed' and 'missing' implicit in this framework, only one of which is consistent with the definition formally encoded in $(Y, R)$. Notation that has been used in the literature for more than three decades fails to distinguish between these two concepts, rendering the notations $f(\mathbf{y}_{obs},\mathbf{y}_{mis})$ and $f(\mathbf{y}_{mis} |\, \mathbf{y}_{obs})$ conceptually contradictory. Additionally, the same notation $f(\mathbf{y}_{mis} |\, \mathbf{y}_{obs})$ is used to refer to two densities with different domains. These densities can be considered equivalent mathematically, but conceptually they are not interchangeable as distributions because of their differing relationships to $(Y, R)$. Only one of these distributions is consistent with $(Y, R)$, and standard conventions for interpreting mathematical notation lead to the conceptually wrong choice for ignorable multiple imputation. We introduce formal definitions and notational improvements to treat these and other ambiguities, and we demonstrate their use through several example derivations.

Publisher URL: http://arxiv.org/abs/1811.04161

DOI: arXiv:1811.04161v1
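As a rough illustration of the idea that 'observed' and 'missing' are defined relative to a response pattern, the sketch below splits one data vector under two different patterns. It is not taken from the paper: the function name `split_by_pattern`, the example values, and the convention that `1` marks an observed entry and `0` a missing one are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def split_by_pattern(
    y: Sequence[float], r: Sequence[int]
) -> Tuple[Tuple[float, ...], Tuple[float, ...]]:
    """Split y into (observed, missing) parts under response pattern r.

    Assumed convention (not from the source): r[i] == 1 means y[i] is
    observed, r[i] == 0 means y[i] is missing.
    """
    if len(y) != len(r):
        raise ValueError("y and r must have the same length")
    observed = tuple(v for v, flag in zip(y, r) if flag == 1)
    missing = tuple(v for v, flag in zip(y, r) if flag == 0)
    return observed, missing


# The same hypothetical data vector under two different response patterns.
y = (1.5, 2.0, 3.5)
print(split_by_pattern(y, (1, 0, 1)))  # ((1.5, 3.5), (2.0,))
print(split_by_pattern(y, (1, 1, 0)))  # ((1.5, 2.0), (3.5,))
```

In this sketch the 'observed' part of the same vector differs between the two calls, so a symbol such as $\mathbf{y}_{obs}$ only has a definite meaning once a pattern is fixed.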