BigNN: An open-source big data toolkit focused on biomedical sentence classification

Publication, Conference
Tafti, AP; Behravesh, E; Assefi, M; Larose, E; Badger, J; Mayer, J; Doan, A; Page, D; Peissig, P
Published in: Proceedings - 2017 IEEE International Conference on Big Data, Big Data 2017
July 1, 2017

Every day, a massive amount of text data is generated by medical data sources such as scientific literature, medical web pages, health-related social media, clinical notes, and drug reviews. Processing this wealth of data is a daunting task that calls for smart and scalable computational strategies, including machine intelligence, big data analytics, and distributed architectures. In this contribution, we designed and developed bigNN, an open-source big data neural network toolkit that tackles large-scale biomedical text classification efficiently, facilitating fast prototyping and reproducible text analytics research. bigNN scales up a word2vec-based neural network model over Apache Spark 2.10 and Hadoop Distributed File System (HDFS) 2.7.3, allowing for more efficient big data sentence classification. The toolkit supports big data computing and simplifies rapid application development in sentence analysis by allowing users to configure and examine internal parameters of both Apache Spark and the neural network model. bigNN is fully documented and is publicly and freely available at https://github.com/bircatmcri/bigNN.

Published In

Proceedings - 2017 IEEE International Conference on Big Data, Big Data 2017

DOI

10.1109/BigData.2017.8258394

ISBN

9781538627143

Publication Date

July 1, 2017

Volume

2018-January

Start / End Page

3888 / 3896
 

Citation

APA: Tafti, A. P., Behravesh, E., Assefi, M., Larose, E., Badger, J., Mayer, J., … Peissig, P. (2017). BigNN: An open-source big data toolkit focused on biomedical sentence classification. In Proceedings - 2017 IEEE International Conference on Big Data, Big Data 2017 (Vol. 2018-January, pp. 3888–3896). https://doi.org/10.1109/BigData.2017.8258394
Chicago: Tafti, A. P., E. Behravesh, M. Assefi, E. Larose, J. Badger, J. Mayer, A. Doan, D. Page, and P. Peissig. “BigNN: An open-source big data toolkit focused on biomedical sentence classification.” In Proceedings - 2017 IEEE International Conference on Big Data, Big Data 2017, 2018-January:3888–96, 2017. https://doi.org/10.1109/BigData.2017.8258394.
ICMJE: Tafti AP, Behravesh E, Assefi M, Larose E, Badger J, Mayer J, et al. BigNN: An open-source big data toolkit focused on biomedical sentence classification. In: Proceedings - 2017 IEEE International Conference on Big Data, Big Data 2017. 2017. p. 3888–96.
MLA: Tafti, A. P., et al. “BigNN: An open-source big data toolkit focused on biomedical sentence classification.” Proceedings - 2017 IEEE International Conference on Big Data, Big Data 2017, vol. 2018-January, 2017, pp. 3888–96. Scopus, doi:10.1109/BigData.2017.8258394.
NLM: Tafti AP, Behravesh E, Assefi M, Larose E, Badger J, Mayer J, Doan A, Page D, Peissig P. BigNN: An open-source big data toolkit focused on biomedical sentence classification. Proceedings - 2017 IEEE International Conference on Big Data, Big Data 2017. 2017. p. 3888–3896.
