Classification of multi-genomic data using MapReduce paradigm

Please use this identifier to cite or link to this item: https://idr.l2.nitk.ac.in/jspui/handle/123456789/7957

Full metadata record

DC Field	Value	Language
dc.contributor.author	Pahadia, M.
dc.contributor.author	Srivastava, A.
dc.contributor.author	Srivastava, D.
dc.contributor.author	Patil, N.
dc.date.accessioned	2020-03-30T10:03:11Z	-
dc.date.available	2020-03-30T10:03:11Z	-
dc.date.issued	2015
dc.identifier.citation	International Conference on Computing, Communication and Automation, ICCCA 2015, 2015, pp. 678-682	en_US
dc.identifier.uri	http://idr.nitk.ac.in/jspui/handle/123456789/7957	-
dc.description.abstract	Counting the number of occurrences of a substring in a string is a problem in many applications. This paper suggests a fast and efficient solution for the field of bioinformatics. A k-mer is a k-length substring of a biological sequence. k-mer counting is defined as counting the number of occurrences of all the possible k-mers in a biological sequence. k-mer counting is used in applications such as error correction of sequencing reads, genome assembly, disease prediction, and feature extraction. We provide a Hadoop-based solution to the k-mer counting problem and then use it for classification of multi-genomic data. The classification is done using classifiers such as Naive Bayes, Decision Tree, and Support Vector Machine (SVM). Accuracy of more than 99% is observed. © 2015 IEEE.	en_US
dc.title	Classification of multi-genomic data using MapReduce paradigm	en_US
dc.type	Book chapter	en_US
Appears in Collections:	2. Conference Papers
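
The k-mer counting described in the abstract can be illustrated with a small in-process sketch of the map, shuffle, and reduce steps. This is not the authors' Hadoop implementation, which is not included in this record; it is plain Python that only mimics the MapReduce data flow. The function names (map_kmers, shuffle, reduce_counts, count_kmers, feature_vector) and the DNA alphabet "ACGT" are assumptions made for illustration, as is the idea of turning the counts into a fixed-order vector that a classifier could consume.

```python
from collections import defaultdict
from itertools import product


def map_kmers(sequence, k):
    """Map step: emit a (k-mer, 1) pair for every k-length window."""
    for i in range(len(sequence) - k + 1):
        yield sequence[i:i + k], 1


def shuffle(pairs):
    """Shuffle step: group emitted values by key, as a MapReduce framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(groups):
    """Reduce step: sum the emitted values for each k-mer."""
    return {kmer: sum(values) for kmer, values in groups.items()}


def count_kmers(sequences, k):
    """Run map, shuffle, and reduce over a collection of sequences."""
    pairs = (pair for seq in sequences for pair in map_kmers(seq, k))
    return reduce_counts(shuffle(pairs))


def feature_vector(counts, k, alphabet="ACGT"):
    """Hypothetical helper: counts in a fixed order over all len(alphabet)**k k-mers."""
    return [counts.get("".join(p), 0) for p in product(alphabet, repeat=k)]


if __name__ == "__main__":
    counts = count_kmers(["ACGTAC", "GTAC"], k=2)
    print(counts)
    print(feature_vector(counts, k=2))
```

In a real Hadoop job the map and reduce functions would run on separate nodes and the framework would perform the shuffle; the sketch keeps everything in one process so the data flow is easy to follow.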
