RECENT PUBLICATIONS

Complex Data (Structured Data)

MapIQ: Benchmarking Multimodal Large Language Models for Map Question Answering

V Srivastava, F Lei, S Mukhopadhyay, V Gupta, R Maciejewski COLM 2025

Recent advancements in multimodal large language models (MLLMs) have driven researchers to explore how well these models read data visualizations such as bar charts and scatter plots. More recently, attention has shifted to visual question answering with maps (Map-VQA). However, Map-VQA research has primarily focused on choropleth maps, which cover only a limited range of thematic categories and visual analytical tasks. To address these gaps, we introduce MapIQ, a benchmark dataset comprising 14,706 question-answer pairs across three map types (choropleth maps, cartograms, and proportional symbol maps), spanning topics from six distinct themes (e.g., housing, crime). We evaluate multiple MLLMs using six visual analytical tasks, comparing their performance against one another and a human baseline. An additional experiment...

TabXEval: Why this is a Bad Table? An eXhaustive Rubric for Table Evaluation

Vihang Pancholi, Jainit Sushil Bafna, Tejas Anvekar, Manish Shrivastava, Vivek Gupta ACL 2025

Evaluating tables qualitatively and quantitatively poses a significant challenge, as standard metrics often overlook subtle structural and content-level discrepancies. To address this, we propose a rubric-based evaluation framework that integrates multi-level structural descriptors with fine-grained contextual signals, enabling more precise and consistent table comparison. Building on this, we introduce TabXEval, an eXhaustive and eXplainable two-phase evaluation framework. TabXEval first aligns reference and predicted...

Map&Make: Schema Guided Text to Table Generation

Naman Ahuja, Fenil Bardoliya, Chitta Baral, Vivek Gupta ACL 2025

Transforming dense, unstructured text into interpretable tables—commonly referred to as Text-to-Table generation—is a key task in information extraction. Existing methods often overlook what complex information to extract and how to infer it from text. We present Map&Make, a versatile approach that decomposes text into atomic propositions to infer latent schemas, which are then used to generate tables capturing both qualitative nuances and quantitative facts. We evaluate...

PRAISE: Enhancing Product Descriptions with LLM-Driven Structured Insights

Adnan Qidwai, Srija Mukhopadhyay, Prerana Khatiwada, Dan Roth, Vivek Gupta ACL 2025

Accurate and complete product descriptions are crucial for e-commerce, yet seller-provided information often falls short. Customer reviews offer valuable details but are laborious to sift through manually. We present PRAISE: Product Review Attribute Insight Structuring Engine, a novel system that uses Large Language Models (LLMs) to automatically extract, compare, and structure insights from customer reviews and seller descriptions. PRAISE provides users with an intuitive interface to identify missing, contradictory, or partially matching details between these two sources, presenting the discrepancies in a clear, structured format alongside supporting evidence from reviews. This allows...

LLM-Symbolic Integration for Robust Temporal Tabular Reasoning

Atharv Kulkarni, Kushagra Dixit, Vivek Srikumar, Dan Roth, Vivek Gupta ACL 2025

Temporal tabular question answering presents a significant challenge for Large Language Models (LLMs), requiring robust reasoning over structured data—a task where traditional prompting methods often fall short. These methods face challenges such as memorization, sensitivity to table size, and reduced performance on complex queries. To overcome these limitations, we introduce TEMPTABQA-C, a synthetic dataset designed for systematic and controlled evaluations, alongside a symbolic intermediate representation that transforms tables into database schemas. This structured approach allows LLMs to generate and execute SQL queries, enhancing generalization and mitigating biases. By incorporating...

GETReason: Enhancing Image Context Extraction through Hierarchical Multi-Agent Reasoning

Shikhhar Siingh, Abhinav Rawat, Chitta Baral, Vivek Gupta ACL 2025

Publicly significant images from events carry valuable contextual information with applications in domains such as journalism and education. However, existing methodologies often struggle to accurately extract this contextual relevance from images. To address this challenge, we introduce GETREASON (Geospatial Event Temporal Reasoning), a framework designed to go beyond surface-level image descriptions and infer deeper contextual meaning. We hypothesize that extracting global event, temporal, and geospatial information from an image enables a more accurate understanding of its contextual significance. We also introduce a new metric, GREAT (Geospatial, Reasoning and Event Accuracy with Temporal alignment), for reasoning-capturing evaluation. Our layered multi-agentic approach, evaluated...

MAPWise: Evaluating Vision-Language Models for Advanced Map Queries

Srija Mukhopadhyay, Abhishek Rajgaria, Prerana Khatiwada, Manish Shrivastava, Dan Roth, Vivek Gupta NAACL 2025

Vision-language models (VLMs) excel at tasks requiring joint understanding of visual and linguistic information. A particularly promising yet under-explored application for these models lies in answering questions based on various kinds of maps. This study investigates the efficacy of VLMs in answering questions based on choropleth maps, which are widely used for data analysis and representation. To facilitate and encourage research in this area, we introduce a novel map-based question-answering benchmark, consisting of maps from three geographical regions (United States, India, China), each containing 1000 questions. Our benchmark...

Leveraging LLM for Synchronizing Information Across Multilingual Tables

Siddharth Khincha, Tushar Kataria, Ankita Anand, Dan Roth, Vivek Gupta NAACL 2025

The vast amount of online information today poses challenges for non-English speakers, as much of it is concentrated in high-resource languages such as English and French. Wikipedia reflects this imbalance, with content in low-resource languages frequently outdated or incomplete. Recent research has sought to improve cross-language synchronization of Wikipedia tables using rule-based methods. These approaches can be effective, but they struggle with complexity and generalization. This paper explores large language models (LLMs) for multilingual information synchronization, using zero-shot prompting as a scalable solution. We introduce the Information Updation dataset, simulating the real-world process of updating outdated Wikipedia tables, and evaluate LLM performance. Our findings...

TRANSIENT TABLES: Evaluating LLMs’ Reasoning on Temporally Evolving Semi-structured Tables

Abhilash Shankarampeta, Harsh Mahajan, Tushar Kataria, Dan Roth, Vivek Gupta NAACL 2025

Humans continuously make new discoveries, and understanding the temporal sequence of events leading to these breakthroughs is essential for advancing science and society. This ability to reason over time allows us to identify future steps and understand the effects of financial and political decisions on our lives. However, large language models (LLMs) are typically trained on static datasets, limiting their ability to perform effective temporal reasoning. To assess the temporal reasoning capabilities of LLMs, we present the TRANSIENTTABLES dataset, which comprises 3,971 questions derived from over 14,000 tables, spanning 1,238 entities across multiple time periods. We introduce a template-based question-generation pipeline that harnesses LLMs to refine both templates and questions. Additionally, we establish baseline...

H-STAR: LLM-driven Hybrid SQL-Text Adaptive Reasoning on Tables

Nikhil Abhyankar, Vivek Gupta, Dan Roth, Chandan Reddy NAACL 2025

Tabular reasoning involves interpreting natural language queries about tabular data, which presents a unique challenge of combining language understanding with structured data analysis. Existing methods employ either textual reasoning, which excels in semantic interpretation but struggles with mathematical operations, or symbolic reasoning, which handles computations well but lacks semantic understanding. This paper introduces a novel algorithm H-STAR that integrates both symbolic and semantic (textual) approaches in a two-stage process to address these limitations. H-STAR employs: (1) step-wise table extraction using ‘multi-view’ column retrieval followed by row extraction, and (2) adaptive reasoning that adapts reasoning strategies based on question types, utilizing semantic reasoning for direct lookup and complex lexical queries while augmenting textual reasoning with symbolic reasoning support for quantitative and logical tasks. Our extensive...

Enhancing Temporal Understanding in LLMs for Semi-structured Tables

Irwin Deng, Kushagra Dixit, Dan Roth, Vivek Gupta NAACL 2025 (Findings)

Temporal reasoning over tabular data presents substantial challenges for large language models (LLMs), as evidenced by recent research. In this study, we conduct a comprehensive analysis of temporal datasets to pinpoint the specific limitations of LLMs. Our investigation leads to enhancements in TempTabQA, a benchmark specifically designed for tabular temporal question answering. We provide critical insights for enhancing LLM performance in temporal reasoning tasks with tabular data. Furthermore, we introduce a novel approach, C.L.E.A.R, to strengthen LLM capabilities in this domain. Our findings demonstrate that our method improves evidence-based reasoning across various models. Additionally, our experimental...

NTSEBENCH: Cognitive Reasoning Benchmark for Vision Language Models

Pranshu Pandya, Vatsal Gupta, Agney S Talwarr, Tushar Kataria, Dan Roth, Vivek Gupta NAACL 2025

Cognitive textual and visual reasoning tasks, including puzzles, series, and analogies, demand the ability to quickly reason, decipher, and evaluate patterns both textually and spatially. Due to extensive training on vast amounts of human-curated data, large language models (LLMs) and vision language models (VLMs) excel in common-sense reasoning tasks, but still struggle with more complex reasoning that demands deeper cognitive understanding. We introduce NTSEBENCH, a new dataset designed to evaluate cognitive multimodal reasoning and problem-solving skills of large models. The dataset contains 2,728 multiple-choice questions, accompanied by a total of 4,642 images, spanning 26 categories....

Evaluating Concurrent Robustness of Language Models Across Diverse Challenge Sets

Vatsal Gupta, Pranshu Pandya, Tushar Kataria, Vivek Gupta, Dan Roth
EMNLP 2024

Language models, characterized by their black box nature, often hallucinate and display sensitivity to input perturbations, causing concerns about trust. To enhance trust, it is imperative to gain a comprehensive understanding of the model’s failure modes and develop effective strategies to improve their performance. In this study, we introduce a methodology designed to examine how input perturbations affect language models across various scales, including pre-trained models and large language models (LLMs). Utilizing fine-tuning, we enhance the model’s robustness to input perturbations. Additionally, ...

Unraveling the Truth: Do VLMs really Understand Charts? A Deep Dive into Consistency and Robustness


Srija Mukhopadhyay, Adnan Qidwai, Aparna Garimella, Pritika Ramu, Vivek Gupta, Dan Roth
EMNLP 2024 (Findings)

Chart question answering (CQA) is a crucial area of Visual Language Understanding. However, the robustness and consistency of current Visual Language Models (VLMs) in this field remain under-explored. This paper evaluates state-of-the-art VLMs on comprehensive datasets, developed specifically for this study, encompassing diverse question categories and chart formats. We investigate two key aspects: 1) the models' ability to handle varying levels of chart and question complexity, and 2) their robustness across different visual representations of the same underlying data. Our analysis reveals significant performance variations based on question and chart types, highlighting both strengths and weaknesses of current models. Additionally,...

Knowledge-Aware Reasoning over Multimodal Semi-structured Tables

Suyash Vardhan Mathur, Jainit Sushil Bafna, Kunal Kartik, Harshita Khandelwal, Manish Shrivastava, Vivek Gupta, Mohit Bansal, Dan Roth EMNLP 2024 (Findings)

Existing datasets for tabular question answering typically focus exclusively on text within cells. However, real-world data is inherently multimodal, often blending images such as symbols, faces, icons, patterns, and charts with textual content in tables. With the evolution of AI models capable of multimodal reasoning, it is pertinent to assess their efficacy in handling such structured data. This study investigates whether current AI models can perform knowledge-aware reasoning on multimodal structured data. We explore their ability to reason on tables that integrate both images and text, introducing MMTABQA, a new dataset designed for this purpose. Our experiments...

FlowVQA: Mapping Multimodal Logic in Visual Question Answering with Flowcharts

Shubhankar Singh, Purvi Chaurasia, Yerram Varun, Pranshu Pandya, Vatsal Gupta, Vivek Gupta, Dan Roth ACL 2024 (Findings)

Existing benchmarks for visual question answering lack in visual grounding and complexity, particularly in evaluating spatial reasoning skills. We introduce FlowVQA, a novel benchmark aimed at assessing the capabilities of visual question-answering multimodal language models in reasoning with flowcharts as visual contexts. FlowVQA comprises 2,272 carefully generated and human-verified flowchart images from three distinct content sources, along with 22,413 diverse question-answer pairs, to test a spectrum of reasoning tasks, including information localization, decision-making, and logical progression. We conduct a thorough baseline evaluation on a suite of both open-source and proprietary multimodal language models using various strategies, followed by an analysis of directional bias. The results...

Evaluating LLMs’ Mathematical Reasoning in Financial Document Question Answering

Pragya Srivastava, Manuj Malik, Vivek Gupta, Tanuja Ganu, Dan Roth ACL 2024 (Findings)

Large Language Models (LLMs) excel in natural language understanding, but their capability for complex mathematical reasoning with a hybrid of structured tables and unstructured text remains uncertain. This study explores LLMs’ mathematical reasoning on four financial tabular question-answering datasets: TATQA, FinQA, ConvFinQA, and Multihiertt. Through extensive experiments with various models and prompting techniques, we assess how LLMs adapt to complex tables and mathematical tasks. We focus on sensitivity to table complexity and performance variations with an increasing number of arithmetic reasoning steps. The results...

ChartCheck: An Evidence-Based Fact-Checking Dataset over Real-World Chart Images

Mubashara Akhtar, Nikesh Subedi, Vivek Gupta, Sahar Tahmasebi, Oana Cocarascu, Elena Simperl ACL 2024 (Findings)

Whilst fact verification has attracted substantial interest in the natural language processing community, verifying misinforming statements against data visualizations such as charts has so far been overlooked. Charts are commonly used in the real world to summarize and communicate key information, but they can also be easily misused to spread misinformation and promote certain agendas. In this paper, we introduce ChartCheck, a novel, large-scale dataset for explainable fact-checking against real-world charts, consisting of 1.7k charts and 10.5k human-written claims and explanations. We systematically...

Enhancing Question Answering on Charts Through Effective Pre-training Tasks

Ashim Gupta, Vivek Gupta, Shuo Zhang, Yujie He, Ning Zhang, Shalin Shah BlackboxNLP 2024 (Findings)

To completely understand a document, the use of textual information is not enough. Understanding visual cues, such as layouts and charts, is also required. While the current state-of-the-art approaches for document understanding (both OCR-based and OCR-free) work well, we have not found any other works conducting a thorough analysis of their capabilities and limitations. Therefore, in this work, we address the limitation of current VisualQA models when applied to charts and plots. To investigate shortcomings of the state-of-the-art models, we conduct a comprehensive behavioral analysis, using ChartQA as a case study. Our findings ...


PAST PUBLICATIONS

  • TempTabQA: Temporal Question Answering for Semi-Structured Tables
    Vivek Gupta, Pranshu Kandoi, Mahek Bhavesh Vora, Shuo Zhang, Yujie He, Ridho Reinanda, Vivek Srikumar
    Published at EMNLP 2023, Paper, Project Page
  • Exploring the Numerical Reasoning Capabilities of Language Models: A Comprehensive Analysis on Tabular Data
    Mubashara Akhtar, Abhilash Shankarampeta, Vivek Gupta, Arpit Patil, Oana Cocarascu, Elena Simperl
    Published at EMNLP 2023 (Findings), Paper
  • InfoSync: Information Synchronization across Multilingual Semi-structured Tables
    Sidharth Khincha, Chelsi Jain, Vivek Gupta, Tushar Kataria, Shuo Zhang
    Published at ACL 2023, presented at Matching@ACL 2023 Project Page, Paper, Video, Poster, PPT
  • Right for the Right Reason: Evidence Extraction for Trustworthy Tabular Reasoning,
    Vivek Gupta, Shuo Zhang, Alakananda Vempala, Yujie He, Temma Choji, Vivek Srikumar
    published at ACL 2022 Paper Poster [PPT] Video Media LinkedIn
  • Is My Model Using The Right Evidence? Systematic Probes for Examining Evidence-Based Tabular Reasoning,
    Vivek Gupta, Riyaz A. Bhat, Atreya Ghosal, Manish Shrivastava, Maneesh Singh, Vivek Srikumar
    published at TACL 2022, presented at ACL 2022 [Paper] [Preprint] [Poster] [PPT] [Video]
  • Bilingual Tabular Inference: A Case Study on Indic Languages
    Chaitanya Agarwal*, Vivek Gupta*, Anoop Kunchukuttan, Manish Shrivastava
    published at NAACL 2022 [Paper] [Preprint] [PPT] [Poster] [Video]
  • Trans-KBLSTM: An External Knowledge Enhanced Transformer BiLSTM model for Tabular Reasoning,
    Yerram Varun*, Aayush Sharma*, Vivek Gupta*
    to appear at DeeLIO-2022 @ACL 2022 [Paper] [Preprint] [Poster] [PPT] [Video]
    Won Best Paper award at DeeLIO-2022
  • XInfoTabS: Evaluating Multilingual Tabular Natural Language Inference,
    Bhavnick Minhas*, Anant Shankhdhar*, Vivek Gupta*, Divyanshu Aggarwal, Shuo Zhang,
    published at MML-2022 (non-archival) and FEVER-2022 (archival) @ACL 2022 [Preprint] [Poster] [PPT] [Video] [Media] [LinkedIn]
  • Enhancing Tabular Reasoning with Pattern Exploiting Training,
    Abhilash Shankarampeta*, Vivek Gupta*, Shuo Zhang
    to appear at SUKI-2022 (non-archival) [Preprint] [PPT] [Poster] [Video]
    (Extended Version at AACL 2022) [Paper] [Project Page] [Media]
  • Efficient Realistic Data Generation Framework for Semi-Structured Tabular Inference,
    Dibyakanti Kumar*, Vivek Gupta*, Soumya Sharma, Shuo Zhang
    to appear at SUKI-2022 (non-archival) [Preprint] [PPT] [Video] [Poster]
    (Extended Version at EMNLP 2022) [Project Page] [Paper] [Media]
  • Leveraging Data Recasting to Enhance Tabular Reasoning,
    Aashna Jena*, Vivek Gupta*, Manish Shrivastava, Julian Martin Eisenschlos
    to appear at SUKI-2022 (non-archival) [Preprint] [Poster] [PPT] [Video]
    (Extended Version at EMNLP 2022) [Project Page] [Paper] [Media] [Poster]
  • RetroNLU: Retrieval Augmented Task Oriented Semantic Parsing,
    Vivek Gupta, Akshat Shrivastava, Adithya Sagar, Armen Aghajanyan, Denis Savenkov,
    to appear at Spa-NLP-2022 (non-archival) and NLP4ConvAI-2022 (archival) @ACL 2022 [Paper] [Preprint] [Poster] [PPT] [Video]
    Won Outstanding Paper award at NLP4ConvAI-2022
  • TabPert: An Effective Platform for Tabular Perturbation,
    Nupur Jain, Vivek Gupta, Anshul Rai, Gaurav Kumar
    Published at EMNLP 2021, Demo track [Paper] [Project Page] [Preprint] [PPT] [Video] [Code]
  • Incorporating External Knowledge to Enhance Tabular Reasoning,
    J. Neeraja*, Vivek Gupta*, and Vivek Srikumar
    Published at NAACL 2021 [Paper] [Project Page] [Code] [Video] [Poster] [PPT]
  • InfoTabS: Inference on Tables as Semi-structured Data,
    Vivek Gupta, Maitrey Mehta, Pegah Nokhiz, Vivek Srikumar
    Published at ACL 2020 [Paper] [Project Page] [Video] [Data] [Code]
  • IndicSemParse: Evaluating Inter-Bilingual Semantic Parsing for Indian Languages
    Divyanshu Aggarwal*, Vivek Gupta*, Anoop Kunchukuttan
    To appear at NLP4ConvAI 2023 Project Page, Preprint
  • IndicXNLI: Evaluating Multilingual Inference for Indian Languages
    Divyanshu Aggarwal*, Vivek Gupta*, Anoop Kunchukuttan
    To appear at MIA-2022 (non-archival) Preprint
  • Logic Driven Classification for Low Resource Settings
    Shagun Uppal, Vivek Gupta, Avinash Swaminathan, Debanjan Mahata, Rakesh Gosangi, Haimin Zhang, Rajiv Ratn Shah, Amanda Stent
    Published at AACL-IJCNLP 2020 Paper
  • A Logic-Driven Framework for Consistency of Neural Models
    Tao Li, Vivek Gupta, Maitrey Mehta, and Vivek Srikumar
    Published at EMNLP-IJCNLP 2019 Paper
  • Unbiasing Review Ratings with Tendency-based Collaborative Filtering
    Pranshi Yadav*, Priya Yadav*, Pegah Nokhiz, Vivek Gupta
    Published at AACL-IJCNLP SRW 2020 Paper
  • User Bias Removal in Review Score Prediction
    Rahul Wadbude, Vivek Gupta, Dheeraj Mekala, Harish Karnick
    Published at CoDS-COMAD 2018 and DAB@CIKM 2017 Paper
  • Equalizing Recourse across Groups
    Vivek Gupta*, Pegah Nokhiz*, Chitradeep Dutta Roy*, Suresh Venkatasubramanian
    Technical Report. Preprint
  • Efficient Estimation of Generalization Error and Bias-Variance Components of Ensembles
    Dhruv Mahajan, Vivek Gupta, Satya Keerthi, Sundararajan Sellamanickam
    Technical Report. [Preprint]
  • Unsupervised Contextualized Document Representation,
    Ankur Gupta, Vivek Gupta
    Published at SustaiNLP 2021 at EMNLP 2021 workshop. [Paper] [Preprint] [PPT] [Poster] [Video] [Code]
  • Improving Document Classification with Multi-Sense Embeddings,
    Vivek Gupta, Ankit Saw, Pegah Nokhiz, Harshit Gupta, and Partha Talukdar
    Published at ECAI 2020 [Paper] [Blog] [Video] [Code]
    (extension of NAACL-SRW 2019 work)
  • P-SIF: Document Embeddings using Partition Averaging
    Vivek Gupta, Ankit Saw, Pegah Nokhiz, Praneeth Netrapalli, Piyush Rai, Partha Talukdar
    Published at AAAI 2020, Presented at SustaiNLP 2020 [Paper] [Appendix] [PPT] [Poster] [Code] [Blog]
  • Word Polysemy Aware Document Vector Estimation
    Vivek Gupta, Ankit Saw, Harshit Gupta, Pegah Nokhiz and Partha Talukdar
    Presented at NAACL-SRW 2019 (non-archival)
    (extended version appears at ECAI 2020) [Code]
  • Sparse Composite Document Vectors using soft clustering over distributional representations
    Dheeraj Mekala*, Vivek Gupta*, Bhargavi Paranjape, Harish Karnick
    Published at EMNLP 2017. [Paper] [PPT] [Video] [Code]
  • On Dimensional Linguistic Properties of the Word Embedding Space
    Vikas Raunak*, Vaibhav Kumar*, Vivek Gupta and Florian Metze
    Presented at ACL-SRW 2019 (non-archival), Published at RepL4NLP 2020 [Paper] [Paper] [Code]
  • Effective Dimensionality Reduction for Word Embeddings
    Vikas Raunak, Vivek Gupta and Florian Metze
    Published at RepL4NLP 2019. [Paper] [Poster] [Code]
  • SumPubMed: Summarization Dataset of PubMed Scientific Articles
    Vivek Gupta, Prerna Bharti, Pegah Nokhiz, Harish Karnick
    Accepted to appear in ACL-IJCNLP SRW 2021 Preprint Dataset PPT
  • Unsupervised Semantic Abstractive Summarization
    Shibhansh Dohare, Vivek Gupta, Harish Karnick
    Published at ACL-SRW 2018 Preprint Paper
  • Distributional Semantics meet Multi-Label Learning
    Vivek Gupta, Rahul Wadbude, Nagarajan Natarajan, Harish Karnick, Prateek Jain, Piyush Rai
    Published at AAAI 2019 Paper
  • Bayes-optimal Hierarchical Classification over Asymmetric Tree-Distance Loss
    Dheeraj Mekala, Vivek Gupta, Purushottam Kar, Harish Karnick
    Technical Report. Report
  • On Long-Tailed Phenomena in Neural Machine Translation,
    Vikas Raunak, Siddharth Dalmia, Vivek Gupta, and Florian Metze
    Published at EMNLP 2020 (Findings), Presented at SPNLP2020 [Paper] [Code]
  • Product Classification in E-Commerce using Distributional Semantics
    Vivek Gupta, Harish Karnick, Ashendra Bansal, Pradhuman Jhala
    Published at COLING 2016 Paper
  • Assisting Humans to Achieve Optimal Sleep by Changing Ambient Temperature
    Vivek Gupta*, Siddhant Mittal*, Sandip Bhaumik, Raj Roy
    Published at BIBM 2016 Paper