Tool Request
Thu, 03/11/2010 - 11:38
Hello
I'm wondering if a tool could be created to check verse structure in a TXT file being prepped for conversion to e-Sword.
In a normal Old and New Testament Bible there are 31,102 verses.
Each book has a set number of chapters, and each chapter has a set number of verses.
Here is what I'm encountering: I create the text from PDF files using OCR software. On old Bibles the text tends to smear slightly, which causes 8 to come out as 3, 5 as 6, and so on.
To top that off, the Bible that I am working on is in part from the Vulgate, which does not follow the KJV verse structure.
Correct verses
1Kings 8:1
1Kings 8:2
1Kings 8:3
1Kings 8:4
1Kings 8:5
1Kings 8:6
1Kings 8:7
1Kings 8:8
1Kings 8:9
1Kings 8:10
1Kings 8:11
1Kings 8:12
1Kings 8:13
1Kings 8:14
1Kings 8:15
1Kings 8:63
1Kings 8:64
1Kings 8:65
1Kings 8:66
Incorrect verses
1Kings 3:1
1Kings 8:2
1Kings 8:8
1Kings 8:4
1Kings 8:6
1Kings 8:6
1Kings 8:7
1Kings 8:8
1Kings 8:9
1Kings 8:10
1Kings 8:11
1Kings 8:12
1Kings 8:13
1Kings 8:14
1Kings 8:15
1Kings 8:63
1Kings 8:64
1Kings 8:65
1Kings 8:66
1Kings 8:67
The tool would read TXT files and as many other formats as the programmer can make it handle.
The tool would check the books in the Bible and tell you if a book is missing or misspelled, and where.
The tool would check the chapters in a Bible and tell you if you are missing some or have too many, and where.
The tool would check the verses in a Bible and tell you if you are missing some or have too many, and where.
In short, it would verify that the file being checked matches the KJV verse structure, and any deviation would be reported with its location so it can be examined and fixed.
God Bless and Keep You
David
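The check David asks for can be sketched in a few lines of Python. This is only an illustration, not any tool mentioned in this thread: the function name `check_references`, the `EXPECTED_VERSES` table, and the assumption that every verse line starts with `Book Chapter:Verse` are all made up for the example. A real table would need an entry for every chapter of every book.

```python
import re
from collections import defaultdict

# Hypothetical reference table: (book, chapter) -> expected number of verses.
# Only one entry is shown here; a real table would cover the whole Bible.
EXPECTED_VERSES = {("1Kings", 8): 66}

# Assumed line layout: "Book Chapter:Verse text..."
REF_PATTERN = re.compile(r"^(\S+)\s+(\d+):(\d+)")

def check_references(lines, expected=EXPECTED_VERSES):
    """Return a list of human-readable problems found in the verse labels."""
    seen = defaultdict(list)  # (book, chapter) -> [(line_no, verse), ...]
    problems = []
    for line_no, line in enumerate(lines, start=1):
        match = REF_PATTERN.match(line.strip())
        if not match:
            continue  # not a verse line
        book = match.group(1)
        chapter, verse = int(match.group(2)), int(match.group(3))
        if (book, chapter) not in expected:
            problems.append(f"line {line_no}: unknown book or chapter {book} {chapter}")
            continue
        seen[(book, chapter)].append((line_no, verse))

    for (book, chapter), entries in seen.items():
        total = expected[(book, chapter)]
        found = [verse for _, verse in entries]
        # Verse numbers beyond the end of the chapter.
        for line_no, verse in entries:
            if verse > total:
                problems.append(
                    f"line {line_no}: {book} {chapter}:{verse} is past the last verse ({total})")
        # Verse numbers that occur more than once.
        for verse in sorted({v for v in found if found.count(v) > 1}):
            problems.append(f"{book} {chapter}:{verse} appears {found.count(verse)} times")
        # Verse numbers that never occur.
        for verse in range(1, total + 1):
            if verse not in found:
                problems.append(f"{book} {chapter}:{verse} is missing")
    return problems
```

Run against the "incorrect" sample above, a check like this would report the stray `1Kings 3:1`, the duplicated `8:6`, the out-of-range `8:67`, and the verse numbers that never appear.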
Thu, 03/11/2010 - 12:50
#2
Re: Tool Request
I'm also writing a tool to compare two resources.
jonathon
What will your tool compare: words or verse structure?
David
Thu, 03/11/2010 - 16:12
#3
Re: Tool Request
DSaw wrote:
What will your tool compare: words or verse structure?
Neither/both.
Currently, it is a multi-stage process checker.
Stage One: Read the plain-text file, and start a new line every time it encounters an Arabic numeral. The output from this process is written to a file: junk.bak.
This is the stage where the header, /header, crossref, and /crossref data is written to a commentary file.
Stage Two: Read junk.bak. Clean up PrAz, GreekEsther, Psalms 152-155, and any other Canonical books that e-Sword doesn't currently have slots for. The resulting file is written to working.bak. This is the stage where books are checked for the "correct" number of chapters, and each chapter is checked to verify that it has the "correct" number of verses. [Eventually, this stage will also convert other versification schemes to that used by e-Sword, prefixing each verse with what it was in the original versification scheme.]
Stage Three: Read working.bak. Compare the number of words in each line, with the number of words in the Greek TR & LXX, Latin Vulgate, English KJVA, and Hebrew MT. The results of this are written out to a file --- word_length.bak
Stage Four: Read word_length.bak, and write out a file issues.txt. This file contains a list of verses that are either 10% longer than expected, or 10% shorter than expected. (This section is independent of both language and writing system.) What it does is calculate the difference between the number of words in the line and the number of words in the texts it uses as "source documents", as a percentage. It then calculates the average percentage for the text, and lines that deviate from that average by more than ten percent are flagged as "possible errors". This routine isn't perfect, but it is a good indication that text is in the "wrong" place.
Stage Five: Read working.bak. Write out concordance.txt and cross_ref.txt. concordance.txt is an alphabetical list of every word in the text, ready for importing into a dictionary. cross_ref.txt is a listing of every word in the text, sorted by verse, ready for importing into a commentary file. (I'd better explain this. For Gen 1:1, it lists all of the words used in that verse, alphabetically, and next to each word, a list of where else that word is used. EG: For John 1:1, it is "sample_word John 1:1, John 1:2, John 1:5, etc".) Think of it as a concordance for every word in the verse. Or, as I call it, _The Concordance At The Bottom of the Page_. For a single Bible, it isn't that useful. But when four or five translations are merged together, especially if Biblical Languages are included, it helps clarify, or as I did last night, refute, claims made about the usage and/or meaning of a word. (Last night's example was the claim that "Baptiso" is used for "washing hands" in Mark 7:3. Both TR and NA-26 use "Nipto" there, not "Baptiso". ["Nipto" usually refers to "cleaning, as an ablution". "Baptiso" is used in Mark 7:4, for washing couches.])
Stage Six: Read working.bak. Write out working.txt. This file contains requested Red Lettering (Words of Jesus; this file is read in. Words of God and Words of the Holy Spirit can be easily added. For that matter, if you want to colour every word spoken by an individual, that can be done: just write the appropriate text file), Pericope headings if requested (English only; this file is read in. Other languages can be easily substituted: just write the appropriate text file), and various other presentation markup formatting.
jonathon
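The length check in Stage Four can be illustrated with a small Python sketch. This is not jonathon's code: the function name `flag_length_outliers`, the dictionary inputs, and the reading of "ten percent" as ten percentage points away from the average ratio are all assumptions made for the example.

```python
def flag_length_outliers(candidate_counts, reference_counts, tolerance=10.0):
    """Flag verses whose length ratio strays from the average ratio.

    candidate_counts and reference_counts map a verse label (for example
    "Gen 1:1") to a word count. The ratio candidate/reference is expressed
    as a percentage, the average ratio over all verses is computed, and any
    verse whose ratio differs from that average by more than `tolerance`
    percentage points is returned as (label, ratio).
    """
    ratios = {}
    for label, count in candidate_counts.items():
        expected = reference_counts.get(label)
        if not expected:
            continue  # no reference data (or zero words): nothing to compare
        ratios[label] = 100.0 * count / expected
    if not ratios:
        return []
    average = sum(ratios.values()) / len(ratios)
    return [(label, round(ratio, 1))
            for label, ratio in ratios.items()
            if abs(ratio - average) > tolerance]
```

Comparing against the average ratio rather than against 100% is what would let a check like this work across languages, since a translation can be consistently wordier or terser than its source without every verse being flagged.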
Thu, 03/11/2010 - 17:11
#4
Re: Tool Request
If I understand your description, it would solve my problem in part:
it will flag differences from the Vulgate and show where to examine and fix the TXT file.
Here is what I'm not sure about: does it only check for the correct number of verses in a chapter, or does it also check the numerical sequence within a chapter?
When converting from PDF, I sometimes get 3's that get processed as 8's, so I would have the correct number of verses in a chapter, but two would have the same number.
Would this tool catch that also?
And would it work on Windows XP SP3?
God Bless and Keep You
David
Thu, 03/11/2010 - 18:44
#5
Re: Tool Request
Here is what I'm not sure about: does it only check for the correct number of verses in a chapter, or does it also check the numerical sequence within a chapter?
Both/neither.
It writes the verse out as a line.
If it looks like some verses are incorrectly labelled, it assumes a versification scheme difference, and adds the incorrect label to the verse, and relabels the verse in the correct sequence.
In your instance, in working.txt, you'd see
1Kings 8:1 (I Kings 3:1) impsum lorem
1Kings 8:2 impsum lorem
1Kings 8:8 impsum lorem
1Kings 8:4 impsum lorem
1Kings 8:5 (I Kings 8:6) impsum lorem
1Kings 8:6 impsum lorem
1Kings 8:7 impsum lorem
and in issues.txt, you would see
1Kings 8:1 versification change
1Kings 8:5 versification change
And would it work on Windows XP SP3?
Cross platform. It should run on *Nix, Mac OS X, and Windows equally well.
After this tool is released, I'll be writing a cross-platform tool that converts text files to e-Sword resources. (Part of the reason I started writing e-Sword documentation was to have the data I needed to write a utility program for creating e-Sword resources.)
jonathon
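The relabelling behaviour described in this post could look roughly like the Python sketch below. It is a guess at the idea, not the actual tool: `relabel_chapter` is a hypothetical name, and the sketch assumes the chapter is already known, the lines are in file order, and no verse is missing or extra.

```python
def relabel_chapter(book, chapter, lines):
    """Relabel the verses of one chapter by their position in the file.

    lines is a list of (label, text) pairs in file order. A verse whose
    label does not match its position keeps the old label in parentheses,
    and the mismatch is recorded as an issue.
    Returns (working_lines, issues).
    """
    working, issues = [], []
    for position, (label, text) in enumerate(lines, start=1):
        expected = f"{book} {chapter}:{position}"
        if label == expected:
            working.append(f"{expected} {text}")
        else:
            working.append(f"{expected} ({label}) {text}")
            issues.append(f"{expected} versification change")
    return working, issues
```

Note that this sketch flags every position whose label does not match, so for David's sample it would also relabel the third line (labelled 8:8), which the sample output above does not show.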
Thu, 03/11/2010 - 19:07
#6
Re: Tool Request
It sounds like just what I need.
Any chance of getting a beta version? I'm not a programmer, so I will need some sort of GUI
or detailed instructions.
God Bless and Keep You
David
Thu, 03/11/2010 - 19:54
#7
Re: Tool Request
Any chance of getting a beta version? I'm not a programmer, so I will need some sort of GUI
For all practical purposes it doesn't have a GUI. Start it, write the name of the source file, and let it run.
It is not yet ready for outside beta testing.
jonathon
Fri, 03/12/2010 - 21:15
#8
Re: Tool Request
jonathon,
That's OK! I found a program online that solves my problem. It saved me 4 days of rechecking 31,000-plus verses for 12 missing verses.
I ran it and it showed me right where to look.
God Bless and Keep You
David
Tue, 03/16/2010 - 00:37
#9
Re: Tool Request
David,
Could you please provide an address where you found this program?
In His Name.
Tue, 03/16/2010 - 11:28
#10
Re: Tool Request
Hi Johan
Here are two links to check out. You can also run a web search using the keywords FILE COMPARE SOFTWARE.
This one is free but lacks some good features that BC3 (Beyond Compare) has, though you most likely will not miss them: http://winmerge.org/
This one is free to try but costs $30 after 30 days. Is it worth it? You decide.
http://download.cnet.com/Beyond-Compare/3000-2242_4-10015731.html
God Bless and Keep You
David
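For readers who prefer a script to a GUI compare tool, the same idea (diff the file being checked against a known-good file) can be sketched with Python's standard `difflib` module. The helper names `reference_of` and `diff_references` are invented for this example, and the sketch assumes each line starts with a one-word book name followed by `Chapter:Verse`; multi-word book names would need a different split.

```python
import difflib

def reference_of(line):
    """Return the leading 'Book C:V' part of a verse line (assumed layout)."""
    parts = line.split(maxsplit=2)
    return " ".join(parts[:2])

def diff_references(known_good_lines, candidate_lines):
    """List verse labels that differ between a known-good file and a candidate.

    Lines starting with '-' exist only in the known-good file (missing from
    the candidate); lines starting with '+' exist only in the candidate.
    Only the labels are compared, so differences in wording are ignored.
    """
    good = [reference_of(line) for line in known_good_lines]
    cand = [reference_of(line) for line in candidate_lines]
    diff = difflib.unified_diff(good, cand, "known_good", "candidate",
                                lineterm="", n=0)
    return [entry for entry in diff
            if entry.startswith(("+", "-"))
            and not entry.startswith(("+++", "---"))]
```

Comparing only the labels keeps the output short even when the two files are different translations.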
Thu, 03/18/2010 - 06:18
#11
Re: Tool Request
Thank you, much appreciated.
Thu, 03/18/2010 - 10:59
#12
Re: Tool Request
You're welcome.
I'm rewriting a sanity checker.
* It reads in plain text only;
* It writes out files in B.C.V format;
* Text within and is rewritten to a file for importing as a commentary, with the being replaced with the markup for footnotes;
* Text within is moved to the beginning of the next verse, if it is at the end of the verse;
I'm also writing a tool to compare two resources.
jonathon
For your prayers are like a solid pillar in its midst, and like an indestructible wall surrounding it. (4 Baruch 1:2)