Verse similarity
The Qur'an is characterized by verses that resemble one another, as Allah said in 39:23: اللَّهُ نَزَّلَ أَحْسَنَ الْحَدِيثِ كِتَابًا مُّتَشَابِهًا "Allah has sent down the best statement: a consistent Book wherein is reiteration."
Traditional books have been written on this subject; a list of them can be found here.
This page describes some of the computational attempts to capture such reiterated verses in the Qur'an.
Related Verses from Ibn Kathir
I have attempted to compile a dataset of related verses from Tafseer Ibn Kathir, which resulted in nearly 8,000 pairs of related verses. In certain cases, it may not be obvious why two verses are related without looking at the context in which Ibn Kathir related them. I then captured the indirect relations as well: if verse x is directly related to verse y according to Ibn Kathir, I looked up the relations of y and marked x as indirectly related to all of those verses. In this way I obtained more than 32,000 pairs of indirectly related verses.
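The one-hop expansion from direct to indirect relations can be sketched in Python as follows. This is only an illustration under stated assumptions: it treats relatedness as symmetric, it excludes verses that are already directly related, and the function names and the verse identifiers `x`, `y`, `z`, `w` are hypothetical placeholders rather than entries from the dataset.

```python
from collections import defaultdict

def build_direct_index(pairs):
    """Map each verse to the set of verses directly related to it.

    Assumption: a relation (x, y) is treated as symmetric.
    """
    index = defaultdict(set)
    for x, y in pairs:
        index[x].add(y)
        index[y].add(x)
    return dict(index)

def indirect_relations(index):
    """For each verse x, collect the verses reached through one intermediate
    verse y, leaving out x itself and its direct relations."""
    indirect = {}
    for x, neighbours in index.items():
        reached = set()
        for y in neighbours:
            reached |= index[y]
        indirect[x] = reached - neighbours - {x}
    return indirect

# Hypothetical placeholder pairs, not taken from the Ibn Kathir dataset.
direct_pairs = [("x", "y"), ("y", "z"), ("z", "w")]
index = build_direct_index(direct_pairs)
indirect = indirect_relations(index)
print(indirect["x"])  # verses reachable from x through y
```

Whether direct relations should be kept apart from indirect ones, as done here, is a design choice; merging the two sets would give a larger but noisier list.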
Eventually, I want to use this dataset, perhaps after pruning and cleaning it down to the obvious relations, as a gold standard for machine learning tasks such as computational relation detection.
Try the application here: you can enter a particular verse to see all the verses directly and indirectly related to it according to Ibn Kathir.
For more details on this dataset, read Verse relatedness in Ibn Kathir.
PHP similar_text function
PHP's similar_text is a function that calculates the similarity between two strings based on the characters they share.
int similar_text ( string $first , string $second [, float &$percent ] )
I have implemented an application based on this function at the following link. The Ajax code was adapted from this tutorial.
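For readers without a PHP setup, a roughly comparable character-based score can be obtained with Python's standard `difflib` module. This is a swapped-in technique, not a port of similar_text: `SequenceMatcher.ratio()` also returns twice the number of matched characters divided by the combined length of both strings, but its matching procedure is not guaranteed to give the same values as PHP, and it compares Unicode characters rather than bytes. The function names and the 50% default threshold below are assumptions made for illustration.

```python
from difflib import SequenceMatcher

def similarity_percent(first: str, second: str) -> float:
    """Character-based similarity in percent: 2 * matches / total length."""
    if not first and not second:
        # difflib would report 1.0 for two empty strings; return 0 instead
        # so that empty input never counts as a match.
        return 0.0
    matcher = SequenceMatcher(None, first, second, autojunk=False)
    return 100.0 * matcher.ratio()

def is_similar(first: str, second: str, threshold: float = 50.0) -> bool:
    """True when the similarity reaches the (assumed) percentage threshold."""
    return similarity_percent(first, second) >= threshold

# "World" and "Word" share "Wor" and "d": 2 * 4 / (5 + 4) of the characters.
print(similarity_percent("World", "Word"))
print(is_similar("abc", "xyz"))
```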
Text::Similarity::Overlaps Module
This is a Perl module developed by Ted Pedersen and is available here. I have used this module for finding similar verses here.
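To give a feel for word-overlap scoring, here is a simplified bag-of-words sketch in Python. It is not the module's algorithm: the module's own matching and scoring may well differ, and the choice of which string supplies the denominator for precision and for recall is an assumption made here. The function name `overlap_scores` is hypothetical.

```python
from collections import Counter

def overlap_scores(first: str, second: str) -> dict:
    """Score two whitespace-tokenised strings by their shared words.

    Assumption: precision is measured against the second string and
    recall against the first.
    """
    a, b = first.split(), second.split()
    # Multiset intersection: a repeated word counts once per shared repeat.
    shared = sum((Counter(a) & Counter(b)).values())
    precision = shared / len(b) if b else 0.0
    recall = shared / len(a) if a else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return {"shared": shared, "precision": precision,
            "recall": recall, "f_measure": f_measure}

scores = overlap_scores("a b c d", "a b x")
print(scores)
```

Because only whole words are compared, two verses that use different forms of the same root score zero on that word, which is one reason such measures can miss related pairs.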
Vector Similarity
I have considered using the vector space model: I calculated tf-idf values for each content word of the Qur'an and created this application. Given the relatively small size of the Qur'anic verses, this does not seem very helpful compared with the other measures.
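A minimal sketch of the vector space approach, in Python, is given below. It assumes one common tf-idf weighting (term frequency normalised by verse length, multiplied by the logarithm of the inverse document frequency) and cosine similarity; the weighting actually used in the application is not documented here, and content-word filtering is left out. The placeholder documents stand in for verses.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Build one sparse tf-idf vector (a dict) per whitespace-tokenised doc."""
    tokenised = [doc.split() for doc in docs]
    n = len(tokenised)
    # Document frequency: in how many docs each term occurs.
    df = Counter(term for tokens in tokenised for term in set(tokens))
    vectors = []
    for tokens in tokenised:
        tf = Counter(tokens)
        vectors.append({term: (count / len(tokens)) * math.log(n / df[term])
                        for term, count in tf.items()})
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is zero."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

# Placeholder documents, not Qur'anic text.
vectors = tfidf_vectors(["a b", "a c", "d e"])
print(cosine(vectors[0], vectors[1]))
```

With very short texts each vector has only a handful of non-zero weights, so a single shared or missing word swings the cosine considerably, which fits the observation above about short verses.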
Verse Segment Similarity
Often a longer verse can be divided into multiple natural segments. Because many of the word similarity programs just check for keyword matches, comparing these smaller verse segments brings more hits.
I have segmented the verses based on the pause marks available in the Medina Qur'an [see QuranComplex.com]. I then used the same Text::Similarity module to find similar verses. This implementation is available here. Also, a PHP similar_text implementation is available here.
As a comparison, see the verses similar to 2:80 in both cases with a precision of 50%: whole-verse similarity returns no results, while segment similarity returns 8 verses.
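Segment-level matching can be sketched in Python as follows. The sketch assumes that pause marks are encoded as the Unicode small high signs U+06D6 to U+06DB, which may not match the encoding of the text actually used, and it scores segments with `difflib` as a stand-in for the measures named above. The function names and the placeholder verses are hypothetical.

```python
import re
from difflib import SequenceMatcher

# Assumption: pause marks are the Unicode small high signs U+06D6..U+06DB;
# adjust this string to whatever marks the source text really contains.
PAUSE_MARKS = "\u06d6\u06d7\u06d8\u06d9\u06da\u06db"

def split_segments(verse, marks=PAUSE_MARKS):
    """Split a verse into non-empty segments at any of the pause marks."""
    parts = re.split("[" + re.escape(marks) + "]", verse)
    return [part.strip() for part in parts if part.strip()]

def similar_segments(query, verses, threshold=50.0):
    """Return (verse_id, segment, percent) for segments scoring >= threshold,
    best match first."""
    results = []
    for verse_id, text in verses.items():
        for segment in split_segments(text):
            matcher = SequenceMatcher(None, query, segment, autojunk=False)
            percent = 100.0 * matcher.ratio()
            if percent >= threshold:
                results.append((verse_id, segment, percent))
    results.sort(key=lambda item: item[2], reverse=True)
    return results

# Placeholder text with one pause mark, not Qur'anic text.
verses = {"v1": "alpha beta\u06da gamma delta", "v2": "xyz"}
hits = similar_segments("gamma delta", verses)
print(hits[0])
```

Comparing the query with "gamma delta" alone gives a full match, whereas comparing it with the whole of `v1` would dilute the score with the unrelated first half; this is the effect described above for 2:80.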
Benchmarking lexical similarity measures with this dataset
The following table provides the recall rates of a few lexical similarity measures against the dataset collected from Ibn Kathir. That is, if Ibn Kathir marked a pair of verses as related, how well do these measures mark the same pair as related?
| Lexical similarity method | All pairs (7,679) | Strong relations, Level 2 (3,079) | Pairs with common roots (4,896) |
| Text::Similarity::Overlaps module by Ted Pedersen (precision >= 50%) | 92 pairs (1.2%) | 71 pairs (2.3%) | 84 pairs (2.7%) |
| PHP similar_text function (precision >= 50%) | 612 pairs (8%) | 441 pairs (14%) | 512 pairs (10.5%) |
| Availability of common roots | 4,896 pairs (64%) | 2,590 pairs (84.1%) | n/a (100%) |
| Availability of common pronouns(*) | 3,095 pairs (40.3%) | 1,488 pairs (48.3%) | 2,251 pairs (46%) |
(*) We found that among the 489 pairs that are strongly related and do NOT share common keywords, 192 pairs (39.3%) had common pronouns, and among the 2,783 pairs that have NO common keyword, 844 pairs (30.3%) have common pronoun concepts.
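The recall figures in the table are a count of detected pairs divided by the number of pairs in the gold set, for example 4,896 of 7,679 pairs for common roots. A minimal Python sketch of that calculation, using a common-root check as the relatedness test, is shown below. The names `recall` and `shares_root`, the `roots` lookup, and the identifiers `v1` to `v4` and `r1` to `r4` are hypothetical placeholders; a real run would need a root lexicon for each verse.

```python
def recall(gold_pairs, is_related):
    """Fraction of gold pairs that the measure also marks as related."""
    gold = list(gold_pairs)
    if not gold:
        return 0.0
    hits = sum(1 for a, b in gold if is_related(a, b))
    return hits / len(gold)

# Hypothetical lookup from verse id to the set of roots occurring in it.
roots = {
    "v1": {"r1", "r2"},
    "v2": {"r2"},
    "v3": {"r3"},
    "v4": {"r4"},
}

def shares_root(a, b):
    """True when the two verses have at least one root in common."""
    return bool(roots[a] & roots[b])

# Placeholder gold pairs, not taken from the Ibn Kathir dataset.
gold = [("v1", "v2"), ("v1", "v3"), ("v3", "v4")]
toy_recall = recall(gold, shares_root)
print(f"{100 * toy_recall:.1f}%")

# The same arithmetic applied to the counts reported in the table.
print(f"{100 * 4896 / 7679:.1f}%")
```

Any other measure, such as a thresholded similarity score, can be passed in place of `shares_root` as long as it takes two verse identifiers and returns a boolean.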
Traditional Books
Traditional books have been written on this subject; a list of them can be found here. The following are available online:
- By as-Sakhawi and its explanation download link.
- Al-Burhan by al-Karmani download link.
