Tool Request

12 replies [Last post]
DSaw
Resource Builder
Joined: 02/26/2009
Posts: 414

 
Hello

 
I'm wondering if a tool could be created to check the verse structure of a TXT file being prepped for conversion to e-Sword.

In a normal Old and New Testament Bible there are 31,102 verses.

Each book has a set number of chapters, and each chapter has a set number of verses.

Here is what I'm encountering: I create the text from PDF files using OCR software. On old Bibles the print tends to smear slightly, which causes 8 to come out as 3, 5 as 6, and so on.

To top that off, the Bible I am working on is partly from the Vulgate, which does not follow the KJV verse structure.
 
 

Correct verses

1Kings 8:1
1Kings 8:2
1Kings 8:3
1Kings 8:4
1Kings 8:5
1Kings 8:6
1Kings 8:7
1Kings 8:8
1Kings 8:9
1Kings 8:10
1Kings 8:11
1Kings 8:12
1Kings 8:13
1Kings 8:14
1Kings 8:15
 
1Kings 8:63
1Kings 8:64
1Kings 8:65
1Kings 8:66
 

Incorrect verses
1Kings 3:1
1Kings 8:2
1Kings 8:8
1Kings 8:4
1Kings 8:6
1Kings 8:6
1Kings 8:7
1Kings 8:8
1Kings 8:9
1Kings 8:10
1Kings 8:11
1Kings 8:12
1Kings 8:13
1Kings 8:14
1Kings 8:15
 
1Kings 8:63
1Kings 8:64
1Kings 8:65
1Kings 8:66
1Kings 8:67


 
 
 
The tool would read TXT files and as many other formats as the programmer can support.

The tool would check the books in the Bible and tell you if a book is missing or misspelled, and where.

The tool would check the chapters in the Bible and tell you if you are missing any or have too many, and where.

The tool would check the verses in the Bible and tell you if you are missing any or have too many, and where.

In short, it would verify that the file being checked matches the KJV verse structure, and any deviation would be reported with its location so it can be examined and fixed.
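
A minimal Python sketch of the kind of sequence check requested here. It is not an existing tool. `VERSE_COUNTS` is a hypothetical one-chapter excerpt of a verse-count table (1Kings 8 has 66 verses, as in the correct list above), and `check_sequence`, `successor`, and `format_ref` are illustrative names. It assumes every verse line starts with a reference such as `1Kings 8:1`.

```python
import re

# Hypothetical excerpt of a verse-count table: {book: {chapter: verse_count}}.
# Only 1Kings 8 is filled in; a real checker would need every book and chapter.
VERSE_COUNTS = {"1Kings": {8: 66}}

# Assumes each verse line starts with a reference such as "1Kings 8:1".
REFERENCE = re.compile(r"^(\S+)\s+(\d+):(\d+)")


def format_ref(ref):
    book, chapter, verse = ref
    return f"{book} {chapter}:{verse}"


def successor(ref, counts):
    """Return the reference expected after ref, or None when the table runs out."""
    book, chapter, verse = ref
    if verse < counts[book][chapter]:
        return (book, chapter, verse + 1)
    if chapter + 1 in counts[book]:
        return (book, chapter + 1, 1)
    return None


def check_sequence(lines, start, counts=VERSE_COUNTS):
    """Return (line_number, message) pairs for labels that do not fit their position."""
    problems = []
    expected = start
    for line_number, line in enumerate(lines, start=1):
        match = REFERENCE.match(line.strip())
        if match is None:
            continue  # blank line, or text without a leading reference
        found = (match.group(1), int(match.group(2)), int(match.group(3)))
        if expected is None:
            problems.append((line_number, f"extra verse {format_ref(found)}"))
            continue
        if found != expected:
            problems.append(
                (line_number, f"expected {format_ref(expected)}, found {format_ref(found)}")
            )
        # Trust the position rather than the label, so one misread digit
        # does not throw off the lines that follow it.
        expected = successor(expected, counts)
    return problems
```

Run against the first seven lines of the incorrect list above with `start=("1Kings", 8, 1)`, it reports lines 1, 3, and 5. Because it trusts position over label, a genuinely missing verse makes every later line in the chapter mismatch, so the first reported line is the one to inspect.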
 
God Bless and Keep You
 
David


The same thing said a different way sparks the understanding

David

bible.study.software
Joined: 03/26/2009
Posts: 414
Re: Tool Request

DSaw wrote:
I am wondering if a tool could be created to check verse structure in a txt file being prepped for conversion to E-Sword

I'm rewriting a sanity checker.
* It reads in plain text only;
* It writes out files in B.C.V format;
* Text within header and crossref tags is rewritten to a file for importing as a commentary, with the tags being replaced with the markup for footnotes;
* Text within is moved to the beginning of the next verse, if it is at the end of the verse;

I'm also writing a tool to compare two resources.

jonathon

For your prayers are like a solid pillar in its midst, and like an indestructible wall surrounding it. (4 Baruch 1:2)

DSaw
Re: Tool Request

bible.study.software wrote:
I'm also writing a tool to compare two resources.

 
jonathon
What will your tool compare: words or verse structure?
 
David


bible.study.software
Re: Tool Request

DSaw wrote:
What will your tool compare: words or verse structure?

Neither/both.

Currently, it is a multi-stage process checker.

Stage One: Read the plain-text file, and start a new line every time it encounters an Arabic numeral. The output from this process is written to a file --- junk.bak.
This is the stage where the header, /header, crossref, and /crossref data is written to a commentary file.
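
A rough Python sketch of the line-splitting part of Stage One, not the actual code of the tool. It assumes the input is running verse text with the verse numbers inline, and `split_at_numbers` is an illustrative name. Splitting on an empty match needs Python 3.7 or later.

```python
import re


def split_at_numbers(text):
    """Start a new line at every run of Arabic digits."""
    # Zero-width split: the position must be followed by a digit and not
    # preceded by one, so multi-digit numbers such as "10" stay whole.
    pieces = re.split(r"(?<!\d)(?=\d)", text)
    return [piece.strip() for piece in pieces if piece.strip()]


if __name__ == "__main__":
    sample = "1 In the beginning 2 And the earth 10 And God"
    for line in split_at_numbers(sample):
        print(line)
```

Any digit starts a new line under this rule, so a number inside the verse text, or a reference such as `8:1`, would also be split and would need handling in a later stage.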

Stage Two: Read junk.bak. Clean up PrAz, GreekEsther, Psalms 152-155, and any other Canonical books that e-Sword doesn't currently have slots for. The resulting file is written to working.bak. This is the stage where books are checked for the "correct" number of chapters, and each chapter is checked to verify that it has the "correct" number of verses. [Eventually, this stage will also convert other versification schemes to that used by e-Sword, prefixing each verse with what it was in the original versification scheme.]

Stage Three: Read working.bak. Compare the number of words in each line with the number of words in the Greek TR & LXX, Latin Vulgate, English KJVA, and Hebrew MT. The results of this are written out to a file --- word_length.bak.

Stage Four: Read word_length.bak, and write out a file issues.txt. This file contains a list of verses that are more than 10% longer or more than 10% shorter than expected. (This section is both language and writing-system independent.) What it does is calculate the difference between the number of words in the line and the number of words in the texts it uses as "source documents", as a percentage. It then calculates the average percentage for the text, and lines that deviate from that average by more than ten percent are flagged as being "possible errors". This routine isn't perfect, but it is a good indication that text is in the "wrong" place.
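
One way to read the Stage Four calculation, as a hedged Python sketch rather than the tool's own code. The function name, the dictionary inputs, and the reading of "ten percent" as ten percentage points away from the average are all assumptions.

```python
def flag_length_outliers(target_counts, source_counts, tolerance=10.0):
    """Flag verses whose length ratio strays from the average ratio.

    target_counts and source_counts map a verse reference to a word count.
    The ratio is the target length as a percentage of the source length.
    """
    ratios = {}
    for ref, words in target_counts.items():
        expected = source_counts.get(ref)
        if not expected:
            continue  # no source verse (or an empty one) to compare against
        ratios[ref] = 100.0 * words / expected
    if not ratios:
        return []
    average = sum(ratios.values()) / len(ratios)
    # Assumption: "more than ten percent" means more than ten percentage
    # points away from the average ratio for the whole text.
    return [ref for ref, ratio in ratios.items() if abs(ratio - average) > tolerance]
```

Comparing against the average ratio, instead of against 100%, is what keeps the check language independent: a translation that consistently runs 20% longer than its source is not flagged, only the verses that break its own pattern.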

Stage Five: Read working.bak. Write out concordance.txt and cross_ref.txt. concordance.txt is an alphabetical list of every word in the text, ready for importing into a dictionary. cross_ref.txt is a listing of every word in the text, sorted by verse, ready for importing into a commentary file. (I'd better explain this. For Gen 1:1, it lists all of the words used in that verse, alphabetically, and next to each word, a list of where else that word is used. E.g., for John 1:1, it is "sample_word John 1:1, John 1:2, John 1:5, etc".) Think of it as a concordance for every word in the verse. Or, as I call it, _The Concordance At The Bottom of the Page_. For a single Bible, it isn't that useful. But when four or five translations are merged together, especially if Biblical Languages are included, it helps clarify --- or, as I did last night, refute --- claims made about the usage and/or meaning of a word. (Last night's example was the claim that "Baptiso" is used for "washing hands" in Mark 7:3. Both TR and NA-26 use "Nipto" there, not "Baptiso". ["Nipto" usually refers to "cleaning, as an ablution". "Baptiso" is used in Mark 7:4, for washing couches.])
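
A small Python sketch of the two Stage Five outputs, offered as an illustration and not as the tool's implementation. `build_concordance` and `cross_references` are hypothetical names, the input is assumed to be a mapping from verse reference to verse text, and words are taken to be runs of letters or digits, lowercased.

```python
import re
from collections import defaultdict


def words_of(text):
    """Return the distinct words of a verse, lowercased, in alphabetical order."""
    return sorted(set(re.findall(r"\w+", text.lower())))


def build_concordance(verses):
    """Map every word to the references that use it, with the words in alphabetical order."""
    index = defaultdict(list)
    for ref, text in verses.items():
        for word in words_of(text):
            index[word].append(ref)
    return dict(sorted(index.items()))


def cross_references(verses, concordance):
    """For each verse, list its words alphabetically with every verse using each word."""
    return {
        ref: {word: concordance[word] for word in words_of(text)}
        for ref, text in verses.items()
    }


if __name__ == "__main__":
    sample = {
        "John 1:1": "In the beginning was the Word",
        "John 1:2": "The same was in the beginning with God",
    }
    concordance = build_concordance(sample)
    for word, refs in cross_references(sample, concordance)["John 1:1"].items():
        print(word, ", ".join(refs))
```

As in the John 1:1 example, each word's list includes the verse itself. Splitting on `\w+` works for scripts that separate words with spaces or punctuation; other writing systems would need their own word segmentation.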

Stage Six: Read working.bak. Write out working.txt. This file contains requested Red Lettering (Words of Jesus. This file is read in. Words of God and Words of the Holy Spirit can be easily added/included. For that matter, if you want to colour every word spoken by an individual, that can be done. Just write the appropriate text file.), Pericope headings if requested (English only. This file is read in. Other languages can be easily substituted. Just write the appropriate text file.), and various other presentation markup formatting.

jonathon


DSaw
Re: Tool Request

If I understand your description, it would solve my problem in part.

It will flag differences from the Vulgate and show where to examine and fix the TXT file.

Here is what I'm not sure about: does it only check for the correct number of verses in a chapter, or does it also check the numerical sequence within a chapter?

Converting from PDF, I sometimes get 3's that are processed as 8's, so I would have the correct number of verses in a chapter but two would have the same number.

Would this tool catch that also?

And would it work on Windows XP SP3?
 
God Bless and Keep You
 
David


bible.study.software
Re: Tool Request

DSaw wrote:
Here is what I'm not sure about: does it only check for the correct number of verses in a chapter, or does it also check the numerical sequence within a chapter?

Both/neither.

It writes the verse out as a line.
If it looks like some verses are incorrectly labelled, it assumes a versification scheme difference, and adds the incorrect label to the verse, and relabels the verse in the correct sequence.

In your instance, in working.txt, you'd see
1Kings 8:1 (I Kings 3:1) lorem ipsum
1Kings 8:2 lorem ipsum
1Kings 8:3 (I Kings 8:8) lorem ipsum
1Kings 8:4 lorem ipsum
1Kings 8:5 (I Kings 8:6) lorem ipsum
1Kings 8:6 lorem ipsum
1Kings 8:7 lorem ipsum

and in issues.txt, you would see
1Kings 8:1 versification change
1Kings 8:3 versification change
1Kings 8:5 versification change
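
The relabelling idea can be sketched in Python roughly as follows. This is an assumption about how such a step might work, not the tool's actual code: `relabel_chapter` is a hypothetical name, it handles one chapter at a time, and it numbers verses purely by their position in the input.

```python
import re

# Assumes each line looks like "1Kings 8:1 verse text".
REFERENCE = re.compile(r"^(\S+)\s+(\d+):(\d+)\s*(.*)$")


def relabel_chapter(lines, book, chapter):
    """Renumber one chapter's verses by position, keeping any label that differs."""
    working, issues = [], []
    verse = 0
    for line in lines:
        match = REFERENCE.match(line.strip())
        if match is None:
            continue  # not a verse line
        verse += 1
        found_book, found_chapter, found_verse, text = match.groups()
        label = f"{book} {chapter}:{verse}"
        original = f"{found_book} {found_chapter}:{found_verse}"
        if original != label:
            # Keep the label from the file in parentheses and log the change.
            working.append(f"{label} ({original}) {text}".rstrip())
            issues.append(f"{label} versification change")
        else:
            working.append(f"{label} {text}".rstrip())
    return working, issues
```

With the first seven lines of DSaw's incorrect list as input, this sketch relabels the first, third, and fifth lines and logs those three in the issues list. Pure positional numbering cannot tell a misread digit from a missing verse, so the issues list still needs a human to review it.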

Quote:
And would it work on Windows XP SP3?

Cross platform. It should run on *Nix, Mac OS X, and Windows equally well.

After this tool is released, I'll be writing a cross-platform tool that converts text files to e-Sword resources. (Part of the reason I started writing e-Sword documentation was to have the data I needed to write a utility program for creating e-Sword resources.)

jonathon


DSaw
Re: Tool Request

It sounds like just what I need.

Any chance of getting a beta version? I'm not a programmer, so I will need some sort of GUI or detailed instructions.
 
God Bless and Keep You
 
David


bible.study.software
Re: Tool Request

DSaw wrote:
Any chance of getting a beta version? I'm not a programmer, so I will need some sort of GUI

For all practical purposes it doesn't have a GUI. Start it, write the name of the source file, and let it run.

It is not yet ready for outside beta testing.

jonathon


DSaw
Re: Tool Request

jonathon

That's OK! I found a program online that solves my problem. It saved me 4 days of rechecking 31,000-plus verses for 12 missing verses.

I ran it and it showed me right where to look.
 
God Bless and Keep You
 
David


Johan
Joined: 11/19/2008
Posts: 20
Re: Tool Request

David,

Could you please provide an address where you found this program?

In His Name.

Johan

DSaw
Re: Tool Request

Hi Johan
 
Here are two links to check out. You can also run a web search using the keywords FILE COMPARE SOFTWARE.

This one is free but lacks some good features that BC3 (Beyond Compare 3) has; you most likely will not miss them: http://winmerge.org/

This one is free to try but costs $30 after 30 days. Is it worth it? You decide.
http://download.cnet.com/Beyond-Compare/3000-2242_4-10015731.html
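
For anyone who would rather script the comparison, the same idea can be sketched with Python's standard `difflib` module. This is only an illustration: `compare_references` and `reference_of` are made-up names, and it assumes both files put a reference such as `1Kings 8:1` at the start of every verse line, with one file being a known-good text such as a KJV export.

```python
import difflib


def reference_of(line):
    """Return the leading 'Book chapter:verse' part of a verse line."""
    return " ".join(line.split()[:2])


def compare_references(expected_lines, actual_lines):
    """Report where the reference column of two verse lists differs.

    Returns (kind, expected_refs, actual_refs) tuples, where kind is
    'missing', 'extra', or 'changed'.
    """
    expected = [reference_of(line) for line in expected_lines if line.strip()]
    actual = [reference_of(line) for line in actual_lines if line.strip()]
    # autojunk=False keeps difflib from discarding frequent items in long lists.
    matcher = difflib.SequenceMatcher(a=expected, b=actual, autojunk=False)
    kinds = {"delete": "missing", "insert": "extra", "replace": "changed"}
    report = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            report.append((kinds[tag], expected[i1:i2], actual[j1:j2]))
    return report
```

Only the references are compared, not the verse text, so two different translations can be checked against each other as long as they share a versification.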
 
 
God Bless and Keep You
 
David


Johan
Re: Tool Request

Thank you, much appreciated.

Johan

DSaw
Re: Tool Request

You're welcome.

