YS

Hello
I found a method in the help
"getFilePath ( String DefName, String DefExt, String DefDir, String Title, int Options )",
it provides an array of selected files.
Also in the help I found the method
"getData ( )",
which produces an array of bytes (which is what I want).
Using a report, I am trying to output an array of bytes to a text file.
Script code:

function getImportFile() {
        var selFile = Dialogs.getFilePath("",
                                           "*.pdf!!Adobe Acrobat Files|*.pdf|Text Files|*.txt||",
                                           "",
                                           "Select file to be opened",
                                           0)[0];
    Dialogs.MsgBox("You selected: " + selFile.getName());
    //the file's data is available via selFiles[i].getData() which returns the content as a byte[].
       if (selFile != null) {
              return selFile;
       }
       return null;
}

function main() {
     var importFile = getImportFile();
     if (importFile != null) {
          var fileData = importFile.getData(); //the file's data is available via getData() which returns the content as a byte[].
          oOutput = Context.createOutputObject();
          oOutput.OutputTxt(fileData);          
     }
     oOutput.WriteReport();
}

main();

When debugging this script, I check the contents of the "fileData" variable, the data is correct (see pic. "ScreenShotOfValues.jpg").
However, when outputting a text file, its content (see pic. "ScreenShotOfOutput.jpg") contains data that does not resemble the contents of a pdf file (see pic "ScreenShotOfPdfContent.jpg").
When using additional function:

function bin2string(array){
    var result = "";
    for(var i = 0; i < array.length; ++i){
    result+= (String.fromCharCode(array[i]));
    }
    return result;
}

The content of the resulting text file is similar to the content of the pdf file that I opened in notepad, but after saving it as a PDF file, Adobe Reader cannot open that PDF file.


Tell me please how I can get the contents of a pdf file as an array of bytes.

by M. Zschuckelt
Posted on Fri, 09/03/2021 - 12:35

Hi,

as far as I know, a pdf file is (at least in parts) binary data. so you should not experiment with converting the file content using String methods like OutputTxt or String.fromCharCode. The most you can do is reading such a file from somwhere and writing it as-is to somewhere else. With the String methods you always run into trouble that they try to interpret the bytes using some encoding like your local Windows codepage or UTF-8 and mapping that to characters. This may fail occasionally, when you have binary data, which is invalid in the encoding, so you lose information this way. With further transformations back to binary this information cannot be recovered. Also it is the question, if the mapping String -> File is done using the same encoding as was used for the interpretation.

Example: String.fromCharCode expects a Unicode code point between U+0 and U+FFFF. But you are giving it a single byte. If your source file were encoded in UTF-8, a single unicode character of a code-point beyond U+7F would show up in your file as 2 or more bytes. This is something your method does not take into consideration. You assume the byte value is the Unicode code-point.

0
by Yuriy Stolyarov Author
Posted on Tue, 09/07/2021 - 10:13

In reply to by M. Zschuckelt

Hello M. Zschuckelt

Thank you very much for your response! I needed to read the pdf file into an array of bytes and then, encode them in Base64, then save them to txt.
The second action I needed to open this txt-file, read the bytes, decode them, and then create a pdf file and write the received (decoded) bytes there.
Text file with encoded bytes created without problems.
Then I found a topic:
https://www.ariscommunity.com/users/jhroza/2015-03-04-create-excel-2010-xlsx-format-file-script
So I used the getCurrentPath() and getSelectedFile() methods (after the WriteReport() method), I took the generated file, load it to Java.
In Java, used FileOutputStream to overwrite a file, writing decoded bytes into it, saved this file.
This decoded pdf file opened successfully.

P.S.

This report is a check for further action - creating an xml file and adding a PDF file to it as a code in Base64 format

0

Featured achievement

Rookie
Say hello to the ARIS Community! Personalize your community experience by following forums or tags, liking a post or uploading a profile picture.
Recent Unlocks

Leaderboard

|
icon-arrow-down icon-arrow-cerulean-left icon-arrow-cerulean-right icon-arrow-down icon-arrow-left icon-arrow-right icon-arrow icon-back icon-close icon-comments icon-correct-answer icon-tick icon-download icon-facebook icon-flag icon-google-plus icon-hamburger icon-in icon-info icon-instagram icon-login-true icon-login icon-mail-notification icon-mail icon-mortarboard icon-newsletter icon-notification icon-pinterest icon-plus icon-rss icon-search icon-share icon-shield icon-snapchat icon-star icon-tutorials icon-twitter icon-universities icon-videos icon-views icon-whatsapp icon-xing icon-youtube icon-jobs icon-heart icon-heart2 aris-express bpm-glossary help-intro help-design Process_Mining_Icon help-publishing help-administration help-dashboarding help-archive help-risk icon-knowledge icon-question icon-events icon-message icon-more icon-pencil forum-icon icon-lock