IJSREG Trion Studio

Vol 10, No 2:

Design of Stochastic Model for Mutated Malignant Gene Sequences using Genomic Data Mining
Abstract
This paper deals with early disease detection from gene sequences. A novel machine learning approach is proposed to predict gene sequences, alongside conventional prediction methods such as the Markov chain and the Hidden Markov Model (HMM). The work attempts to predict malignancy before it occurs by using mutated TP53 gene sequences. To verify the performance of the Markov chain, the sequences are checked for the existence of CpG islands and a distribution is fitted. The HMM is then introduced as an extension of the Markov chain: the hidden states and their corresponding probabilities are estimated, and the optimal path for each sequence is determined. To identify the probabilities of each hidden and visible state, a profile HMM (PHMM) is implemented. The PHMM indicates how well a new sequence fits the model and whether that sequence may be related to the sequences used to train the model. The performance of the trained model is compared against previously existing models.
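As a rough illustration of what determining the optimal path through an HMM involves, the sketch below runs the standard Viterbi algorithm on a two-state model of a DNA sequence. It is not the authors' implementation: the state names ("island" for a CpG-rich region, "background" otherwise) and every probability value are assumptions chosen only to make the example runnable, and the abstract does not state which decoding procedure or parameters were actually used.

```python
import math

# Hypothetical two-state model; the states and all probabilities below are
# illustrative assumptions, not values taken from the paper.
STATES = ("island", "background")
START = {"island": 0.5, "background": 0.5}
TRANS = {
    "island": {"island": 0.9, "background": 0.1},
    "background": {"island": 0.1, "background": 0.9},
}
EMIT = {
    "island": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
    "background": {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3},
}


def viterbi(seq):
    """Return (most probable hidden-state path, its log joint probability)."""
    if not seq:
        return [], 0.0
    # score[s] holds the best log probability of any path ending in state s.
    score = {s: math.log(START[s]) + math.log(EMIT[s][seq[0]]) for s in STATES}
    backpointers = []
    for symbol in seq[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            # Pick the predecessor state that maximises the path score into s.
            prev = max(STATES, key=lambda p: score[p] + math.log(TRANS[p][s]))
            pointers[s] = prev
            new_score[s] = (
                score[prev] + math.log(TRANS[prev][s]) + math.log(EMIT[s][symbol])
            )
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final state to recover the full path.
    path = [max(STATES, key=lambda s: score[s])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, score[path[-1]]
```

Calling `viterbi("CGCG")` returns one state label per nucleotide together with the log probability of that labelling. Working in log space avoids numerical underflow on long sequences, which matters for gene-length inputs.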
ISSN(P) 2350-0174

ISSN(O) 2456-2378
