Hello
I found a method in the help
"getFilePath ( String DefName, String DefExt, String DefDir, String Title, int Options )",
it provides an array of selected files.
Also in the help I found the method
"getData ( )",
which produces an array of bytes (which is what I want).
Using a report, I am trying to output an array of bytes to a text file.
Script code:
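function getImportFile() {
    // getFilePath returns an array of the selected files.
    var selFiles = Dialogs.getFilePath("",
        "*.pdf!!Adobe Acrobat Files|*.pdf|Text Files|*.txt||",
        "",
        "Select file to be opened",
        0);
    // Guard against a cancelled dialog (nothing selected) before touching the first entry.
    if (selFiles == null || selFiles.length == 0) {
        return null;
    }
    var selFile = selFiles[0];
    Dialogs.MsgBox("You selected: " + selFile.getName());
    // The file's data is available via selFile.getData(), which returns the content as a byte[].
    return selFile;
}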
function main() {
    var importFile = getImportFile();
    if (importFile != null) {
        // getData() returns the file content as a byte[].
        var fileData = importFile.getData();
        var oOutput = Context.createOutputObject();
        oOutput.OutputTxt(fileData);
        // Write the report only when an output object was actually created.
        oOutput.WriteReport();
    }
}
main();
When debugging this script, I checked the contents of the "fileData" variable and the data is correct (see pic. "ScreenShotOfValues.jpg").
However, when outputting a text file, its content (see pic. "ScreenShotOfOutput.jpg") contains data that does not resemble the contents of a pdf file (see pic "ScreenShotOfPdfContent.jpg").
When using additional function:
function bin2string(array) {
    var result = "";
    for (var i = 0; i < array.length; ++i) {
        result += String.fromCharCode(array[i]);
    }
    return result;
}
The content of the resulting text file is similar to the content of the pdf file that I opened in Notepad, but after saving it as a PDF file, Adobe Reader cannot open that PDF file.
Could you please tell me how I can get the contents of a pdf file as an array of bytes?
M. Zschuckelt on
Hi,
as far as I know, a pdf file is (at least in parts) binary data, so you should not experiment with converting the file content using String methods like OutputTxt or String.fromCharCode. The most you can do is read such a file from somewhere and write it as-is to somewhere else. With the String methods you always run into the problem that they try to interpret the bytes using some encoding, such as your local Windows codepage or UTF-8, and map them to characters. This can fail when the binary data contains byte sequences that are invalid in that encoding, so you lose information this way, and further transformations back to binary cannot recover it. There is also the question of whether the mapping String -> File uses the same encoding as was used for the interpretation.
Example: String.fromCharCode expects a UTF-16 code unit, i.e. a value between U+0000 and U+FFFF, but you are giving it a single byte. If your source file were encoded in UTF-8, a single Unicode character with a code point beyond U+007F would show up in your file as 2 or more bytes. Your method does not take this into consideration: it assumes the byte value is the Unicode code point.
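A minimal sketch of this effect, in plain JavaScript so it can be tried outside ARIS (for example in Node.js). The sample bytes and the helpers bin2stringMasked and utf8Length are made up for illustration. It also assumes that the byte[] from getData() behaves like a Java byte array, whose elements are signed (-128 to 127).

```javascript
// Hypothetical sample: "%PDF-" followed by four bytes above 0x7F,
// written as signed values the way a Java byte[] would expose them (assumption).
var signedBytes = [37, 80, 68, 70, 45, -30, -29, -49, -45];

// The helper from the question, unchanged in behavior.
function bin2string(array) {
    var result = "";
    for (var i = 0; i < array.length; ++i) {
        result += String.fromCharCode(array[i]);
    }
    return result;
}

// Variant that first maps each signed byte to 0..255 (hypothetical helper).
function bin2stringMasked(array) {
    var result = "";
    for (var i = 0; i < array.length; ++i) {
        result += String.fromCharCode(array[i] & 0xFF);
    }
    return result;
}

// Number of bytes needed to store str as UTF-8 (hypothetical helper).
// Surrogate pairs are not handled; none occur in this sample.
function utf8Length(str) {
    var n = 0;
    for (var i = 0; i < str.length; ++i) {
        var c = str.charCodeAt(i);
        n += c < 0x80 ? 1 : (c < 0x800 ? 2 : 3);
    }
    return n;
}

var naive = bin2string(signedBytes);
var masked = bin2stringMasked(signedBytes);

// -30 is not kept as byte 0xE2: fromCharCode wraps it to the code unit 0xFFE2.
var wrapped = naive.charCodeAt(5) === 0xFFE2;

// Both strings have 9 characters, but saved as UTF-8 text they no longer
// occupy 9 bytes, so the written file differs from the original.
var sizes = [signedBytes.length, utf8Length(naive), utf8Length(masked)]; // 9, 17, 13
```

Even with the mask, the text would still have to be written with a single-byte encoding such as ISO-8859-1 to get the original bytes back, which is why passing the byte[] along unchanged is the safer route.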