Faster way than append from SDF

Posted: Thu Feb 14, 2019 8:02 pm
by Marc Vanzegbroeck

In an application, I read a text file with more than 1,300,000 lines into a database with one field.
After reading it, I go through the file and, depending on the information on each line, extract the information that I need into another table.
It works fine, but the process takes a while.

I found out that just reading the file with 'APPE FROM .... SDF' takes more than 30 sec.
Is there a faster way to read the information and check the content of each line?


Posted: Fri Feb 15, 2019 8:07 am
by hmpaquito

Don't write, only read.

Use FileEval() function from ... FILEIO.PRG


Posted: Fri Feb 15, 2019 11:14 am
by Marc Vanzegbroeck
Thank you, I will try it.

Posted: Fri Feb 15, 2019 4:45 pm
by Jack
Did you try this?

wnbl:=oTxt:recCount() && number of lines
if wnbl>0
   for wbcl = 1 to wnbl
      && read and check line wbcl here
   next
endif

Try it.
Philippe from Belgium

Posted: Fri Feb 15, 2019 5:45 pm
by Marc Vanzegbroeck

Thank you for the suggestion, but unfortunately, it's slower.
It took 64sec to read it. The 'append from' was 35sec.

Posted: Fri Feb 15, 2019 6:22 pm
by nageswaragunupudi
Will you please try this also?

Code: Select all

#include "fivewin.ch"   // FiveWin header: defines CRLF and the XBROWSER command

#define BUFSIZE   64000
#define EOLCHR    CRLF
#define EOLLEN    2

static cBuf    := ""
static nStart  := 1
static lEof    := .f.


function Main()

   local cFile, hFile, cLine
   local alines   := {}  // for testing
   local nLine    := 0   // for testing
   local nSecs

   cFile    := "c:\fwh\source\classes\xbrowse.prg"  // your file here
   hFile    := FOpen( cFile, 64 )

   ? "Start"

   nSecs    := SECONDS()
   do while ( cLine := ReadLine( hFile ) ) != nil
      // do your work here with cLine
      nLine++                            // for testing
      AAdd( aLines, { nLine, cLine } )   // for testing
   enddo
   FClose( hFile )

   ? "Seconds:", SECONDS() - nSecs

   XBROWSER aLines SHOW RECID // Testing
   ? "Done"

return nil


static function ReadLine( hFile )

   local cLine, nRead, nEolAt

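   // look for the next end of line in what is already buffered
#ifdef __XHARBOUR__
   nEolAt      :=    At( EOLCHR, cBuf, nStart )
#else
   nEolAt      := HB_At( EOLCHR, cBuf, nStart )
#endif
   if nEolAt == 0
      if !lEof
         // keep the unread tail and append the next block from the file
         cBuf     := SubStr( cBuf, nStart ) + ReadBuf( hFile, @lEof )
         nStart   := 1

#ifdef __XHARBOUR__
         nEolAt   :=    At( EOLCHR, cBuf, nStart )
#else
         nEolAt   := HB_At( EOLCHR, cBuf, nStart )
#endif
      endif
   endif
   if nEolAt > 0
      cLine    := SubStr( cBuf, nStart, nEolAt - nStart )
      nStart   := nEolAt + EOLLEN
   else
      // no more line ends: return the remainder, or nil when nothing is left
      if nStart <= Len( cBuf )
         cLine    := SubStr( cBuf, nStart )
      endif
      cBuf  := ""
   endif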

return cLine


static function ReadBuf( hFile, lEof )

   local cBuf  := SPACE( BUFSIZE )
   local nRead

   nRead    := FRead( hFile, @cBuf, BUFSIZE )
   if nRead < BUFSIZE
      // short read: end of file reached, drop the unused part of the buffer
      lEof  := .t.
      cBuf  := Left( cBuf, nRead )
   endif

return cBuf


Posted: Fri Feb 15, 2019 6:28 pm
by James Bott
Maybe you could create a conditional index containing only the records you need. Then you don't even have to move the records to another file. Time required to get the records you want, zero.

Just thinking outside the box.
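
As a rough illustration of the conditional-index idea, here is a minimal sketch. It assumes the lines are already in a DBF with one character field; the table name, field name, index name, and the "ERROR" condition are all hypothetical stand-ins, and the default DBFNTX driver is assumed.

```harbour
// Sketch only: table, field, index names and the FOR condition are hypothetical.
procedure Main()

   dbCreate( "lines", { { "LINE", "C", 80, 0 } } )
   USE lines NEW EXCLUSIVE

   // Stand-in data; the real table would be filled by APPEND FROM ... SDF
   dbAppend()
   FIELD->LINE := "INFO  start"
   dbAppend()
   FIELD->LINE := "ERROR disk full"
   dbAppend()
   FIELD->LINE := "INFO  stop"

   // Conditional index: only records that satisfy the FOR clause are included
   INDEX ON RecNo() TO wanted FOR "ERROR" $ FIELD->LINE

   // With the index active, navigation sees the matching records only
   dbGoTop()
   do while ! Eof()
      ? RecNo(), RTrim( FIELD->LINE )
      dbSkip()
   enddo

   dbCloseArea()

return
```

Building the index still requires one pass over the table, so this helps only once the data is in a DBF.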

Posted: Fri Feb 15, 2019 10:55 pm
by Marc Vanzegbroeck

Your example is very fast :lol: Only 2sec :shock:

I tested the loop without the line

Code: Select all

AAdd( aLines, { nLine, cLine } )
With this line it takes 138 sec :shock: :shock: :shock: .

So the AAdd() takes a lot of time.
That's no problem, because I can do the content check directly and only add the information I need.

The problem was not going through the database, but reading the text file. Otherwise the conditional index was a good idea.
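
The in-loop check described in this post could look roughly like the sketch below. WantLine() and the file name are hypothetical. For brevity the sketch reads a small stand-in file in one go with hb_MemoRead(); for the 1.3-million-line file, the buffered ReadLine() posted earlier in the thread would take its place.

```harbour
// Sketch only: WantLine() and the file name are hypothetical stand-ins.
#define EOL  ( Chr( 13 ) + Chr( 10 ) )

procedure Main()

   local cFile  := "sample.txt"
   local cText, cLine, nEol
   local nStart := 1
   local aHits  := {}

   // Small stand-in file so the sketch runs on its own
   hb_MemoWrit( cFile, "keep A" + EOL + "drop B" + EOL + "keep C" + EOL )

   // Whole-file read for brevity; a buffered reader suits large files better
   cText := hb_MemoRead( cFile )

   do while ( nEol := hb_At( EOL, cText, nStart ) ) > 0
      cLine  := SubStr( cText, nStart, nEol - nStart )
      nStart := nEol + 2
      if WantLine( cLine )          // check the content first ...
         AAdd( aHits, cLine )       // ... and store only what is needed
      endif
   enddo

   ? "Lines kept:", Len( aHits )

return

// Hypothetical filter: keeps lines that start with "keep"
static function WantLine( cLine )
return Left( cLine, 4 ) == "keep"
```

Because only matching lines reach AAdd(), the array stays small compared with storing every line.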

Posted: Fri Feb 15, 2019 11:26 pm
by Rick Lipkin

Here is a snippet from a find-and-replace file utility I wrote that opens any text file you like. I hope you get the gist of how to use it.

Rick Lipkin

Code: Select all

Do While .not. Eof()
   cFILE := ALLTRIM( A->FileName )

   If ( nHANDLE := Fopen( cFILE )) < 0   // FOpen() returns -1 on failure
      SAYING := "ERROR Reading file "+cFILE
      YESNO  := { "Skip", "Abort" }

      nOK := Alert(SAYING,YESNO, )

      DO CASE
      CASE nOK = 1      // skip
           SELECT 1
           lAborted := .f.
      CASE nOK = 2      // abort
           Select 1
           Select 1

   Text_Eof   := FSEEK( nHANDLE, 0, 2 )  // get eof value
   Bytes_Read := 0
   Line       := 0

   Fseek( nHANDLE, 0 )                // rewind to bof()

   For Bytes_read = 0 to Text_eof

       Bytes_begin := FSEEK( nHANDLE, 0, 1)
       cText       := _FreadLine( nHANDLE )        // this is what extracts the text 

       cText := upper( cText )
       cOldText    := cText

    *   msginfo( "cText" )
    *   msginfo( cText )

       DO CASE
       CASE cTEXT = ">EOF<"  .and. Bytes_read = 0
            // do nothing and go on BOF //
       CASE cTEXT = ">EOF<"  .and. Bytes_read > 0

       bytes_read := FSEEK( nHANDLE, 0, 1 )

       cText := cText+"   Line "+str(Line,5)

       cText := cOldText

       nFOUND := AT( cTextFind, cTEXT )

       If nFOUND > 0

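Static FUNC _FreadLine( nHANDLE, LINE_LEN )

LOCAL BUFFER, LINE_END, NUM_BYTES

IF VALTYPE( LINE_LEN ) <> "N"
   LINE_LEN := 512        // assumed default; the original value is not shown
ENDIF

BUFFER    := SPACE( LINE_LEN )
NUM_BYTES := FREAD( nHANDLE, @BUFFER, LINE_LEN )
LINE_END  := AT( CHR(13)+CHR(10), BUFFER )

IF LINE_END = 0           // no CR+LF found, so this must be eof() or bof()
    FSEEK( nHANDLE, 0 )    // go back to top
    RETURN( ">EOF<" )
ENDIF

FSEEK( nHANDLE, (NUM_BYTES * -1) + LINE_END+1, 1 ) // move to next line

RETURN( LEFT( BUFFER, LINE_END-1 ) )

// eof freadline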


Posted: Sat Feb 16, 2019 1:33 am
by James Bott

Marc Vanzegbroeck wrote: "The problem was not going through the database, but reading the text file. Otherwise the conditional index was a good idea."

Well, I was thinking that you could eliminate the export/import and just use the original file with the index set.

I don't know exactly what you are doing with the new file you are creating, but if you are just processing it, then you can just use the original file indexed.

oDBF:setIndex( cWhatever )

Less is more...


Posted: Sat Feb 16, 2019 2:52 am
by nageswaragunupudi
James Bott wrote: "I don't know exactly what you are doing with the new file you are creating, but if you are just processing it, then you can just use the original file indexed."
The original file is a text file. Not a dbf file.
Do you think we can create an index on the text file?

Posted: Sat Feb 16, 2019 4:05 am
by nageswaragunupudi
Mr. Marc
Marc Vanzegbroeck wrote: "That's no problem, because I can do the content check directly and only add the information I need."
Yes, that is the idea.
Earlier you were spending around 30 seconds transferring the data from the text file to a dbf file before starting your process. Now these 30 seconds are reduced to 2 seconds. You can start processing straight away without waiting for the conversion.

Even these 2 seconds can be reduced further:
1) The function was written for variable-length records. Because you are using APPEND FROM .. SDF, I guess the records in the text file are all of a fixed length. If we modify the program to read fixed-length records of a known length, the reading would be even faster.
2) Converting the functions from Harbour to C makes it even faster.

But the additional work may not be worth it just to save 1 more second.
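
If the guess about fixed-length records holds, point 1 could be sketched as below. RECLEN, RECS, and the file name are hypothetical, and the sketch assumes every record is exactly RECLEN bytes followed by CR+LF. It is an illustration of the idea, not a measured replacement for ReadLine().

```harbour
// Sketch only: RECLEN, RECS and the file name are hypothetical; every record
// is assumed to be exactly RECLEN bytes followed by CR + LF.
#include "fileio.ch"

#define RECLEN   20                 // assumed data length of one record
#define EOLLEN   2                  // CR + LF
#define RECS     1000               // records fetched per FRead() call

procedure Main()

   local cFile  := "fixed.txt"
   local nStep  := RECLEN + EOLLEN
   local nLines := 0
   local cBuf, nRead, nPos, cLine, hFile

   // Small stand-in file with three fixed-length records
   hb_MemoWrit( cFile, PadR( "one",   RECLEN ) + Chr( 13 ) + Chr( 10 ) + ;
                       PadR( "two",   RECLEN ) + Chr( 13 ) + Chr( 10 ) + ;
                       PadR( "three", RECLEN ) + Chr( 13 ) + Chr( 10 ) )

   hFile := FOpen( cFile, FO_READ + FO_SHARED )
   if hFile == F_ERROR
      ? "Cannot open", cFile
      return
   endif

   cBuf := Space( nStep * RECS )
   do while ( nRead := FRead( hFile, @cBuf, nStep * RECS ) ) > 0
      // No search for the end of line: each record starts at a known offset
      for nPos := 1 to nRead - RECLEN + 1 step nStep
         cLine := SubStr( cBuf, nPos, RECLEN )
         nLines++                   // check cLine here
      next
   enddo
   FClose( hFile )

   ? "Records read:", nLines

return
```

The buffer size is a whole multiple of the record step, so a record never straddles two reads.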