Can't write a single row from a RowSet to file


Postby asmund » Mon Mar 24, 2014 9:45 am

I'm trying to download a bunch of files, remove some lines from each file, and save them all in one file.

The code below works. It writes each line by calling echo through bash, but it is extremely slow, on the order of one second per line. I also tried the writeFixedWidth task (disabled in the code below), but it didn't work as expected: it consumed the entire rowset on the first iteration, so my line-number check never ran and the lines I wanted to exclude ended up in the file.

Is this a bug or a feature? How can I do this within Director?

I'm using Director version 4.1.1.


Code:
<project name="Geosat solar F10.7 flux" mainModule="Main" version="2.0">
   <description>Download Geosat F10.7 solar flux data</description>

   <module name="Main">

      <ftp label="FTP to NGDC" resourceId="ftp.ngdc.noaa.gov" version="1.0" disabled="false">
         <get label="Get files" destinationDir=" /tmp" whenFileExists="overwrite" destinationFilesVariable="downloaded_files">
            <fileset dir="/STP/space-weather/solar-data/solar-features/solar-radio/noontime-flux/penticton/penticton_observed/tables/">
               <wildcardFilter>
                  <include pattern="drao_noontime-flux-observed_199?.txt" caseSensitive="false" />
                  <include pattern="drao_noontime-flux-observed_20*.txt" caseSensitive="false" />
               </wildcardFilter>
            </fileset>
         </get>
      </ftp>


      <print label="print ${downloaded_files}" version="1.0" disabled="true">
         <![CDATA[Downloaded:
${downloaded_files}]]>
      </print>


      <rename label="Move old merged file out of the way" inputFile=" /tmp/merged.txt" newName="merged 2.txt" whenFileExists="rename" version="1.0" executeOnlyIf="${FileInfo(&quot; /tmp/merged.txt&quot;):exists}" />

      <forLoop label="Loop over years" beginIndex="1992" endIndex="2001" step="1" currentIndexVariable="year" disabled="false">

         <readFlatFile label="Read a F10.7 file" outputRowSetVariable="input_file" recordDelimiter="LF" processedInputFilesVariable="filename" version="1.0">
            <fileset dir=" /tmp">
               <wildcardFilter>
                  <include pattern="*${year}.txt" />
               </wildcardFilter>
            </fileset>
         </readFlatFile>


         <print label="Print processing file" version="1.0">
            <![CDATA[Processing ${filename}]]>
         </print>

         <forEachLoop label="Loop over file lines" itemsVariable="${input_file}" currentItemVariable="line" currentIterationVariable="lineno">

            <print label="print lineno" version="1.0">
               <![CDATA[lineno ${lineno}]]>
            </print>


            <setVariable label="line_deleted = False" name="line_deleted" value="False" version="2.0" />

            <if label="If line number is one to be deleted" condition="${lineno == 2 or lineno == 3 or lineno == 4 or lineno == 5 or lineno &gt; 41}">

               <print label="print deleted line number" version="1.0">
                  <![CDATA[deleted line ${lineno}]]>
               </print>


               <setVariable label="line_deleted = True" name="line_deleted" value="True" version="2.0" />

            </if>
            <if label="else" condition="${line_deleted == False}">

               <setVariable label="set linetext" name="linetext" value="${line[1]}" version="2.0" disabled="true" />


               <print label="print line" version="1.0" disabled="false">
                  <![CDATA[using ${lineno}, ${line[1]}
line: ${line}]]>
               </print>


               <writeFixedWidth label="Write to merged file" inputRowSetVariable="${line}" outputFile=" /tmp/merged.txt" whenFileExists="append" includeHeadings="false" recordDelimiter="LF" version="1.0" disabled="true" />


               <exec label="bash: echo to merged file" executable="/bin/bash" version="1.0">
                  <arg value="-c" />
                  <arg value="/bin/echo &apos;lineno ${lineno}, ${line[1]}&apos; &gt;&gt;  /tmp/merged.txt" />
               </exec>

            </if>
         </forEachLoop>
      </forLoop>
   </module>

</project>
asmund
 
Posts: 1
Joined: Mon Mar 24, 2014 8:25 am

Re: Can't write a single row from a RowSet to file

Postby Support_Jon » Thu Apr 10, 2014 5:29 pm

asmund,

By design, the Write Fixed Width task consumes the entire rowset (as passed in) and iterates through it to write to the specified output file. If you want to write out the rows of a rowset one by one, then you should use our Print Task instead. Just specify the output file and you should see a nice improvement in your performance.
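
As a rough, untested sketch of that suggestion applied to the inner loop of the original project: the exec call is replaced with a Print task that appends to the merged file. The `file` and `append` attributes and the `${system.carriageReturn}` variable are taken from other posts in this thread, not from product documentation, so treat them as assumptions to verify against your Director version. Whether Print adds a line terminator on its own is also an assumption to check.

```xml
<forEachLoop label="Loop over file lines" itemsVariable="${input_file}" currentItemVariable="line" currentIterationVariable="lineno">

   <setVariable label="line_deleted = False" name="line_deleted" value="False" version="2.0" />

   <!-- Same exclusion test as in the original project -->
   <if label="If line number is one to be deleted" condition="${lineno == 2 or lineno == 3 or lineno == 4 or lineno == 5 or lineno &gt; 41}">
      <setVariable label="line_deleted = True" name="line_deleted" value="True" version="2.0" />
   </if>

   <if label="else" condition="${line_deleted == False}">
      <!-- Assumed replacement for the bash/echo exec: append one record per iteration -->
      <print label="Append line to merged file" file="/tmp/merged.txt" append="true" version="1.0">
         <![CDATA[${line[1]}${system.carriageReturn}]]>
      </print>
   </if>

</forEachLoop>
```

Because only `${line[1]}` is printed, the row is written as text rather than being handed to a task that iterates the whole rowset.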

We have a new task coming out with our next release of GoAnywhere Director that I think you will like. It is called ModifyRowset, and it provides a means to modify an existing rowset and perform data translation, manipulation, and filtering. With this task you will be able to easily exclude the rows you don't want and process only the ones you want to include. Our GoAnywhere Director 4.6.0 release, which will include this new feature, is slated for May of this year.

Thanks - Jon
Support_Jon
Support Specialist
 
Posts: 50
Joined: Thu Jul 19, 2012 9:15 am
Location: Ashland, NE

Re: Can't write a single row from a RowSet to file

Postby monahanks » Tue Sep 30, 2014 1:41 pm

Hi Jon,
I need to process a rowset one record at a time. I have added the Print task, but the only output I got was
"com.linoma.dpa.tasks.converters.flatfile.FlatFileRowSet@77095432"
My XML looks like this:

Code:
<project name="Build_Control_xml_new" mainModule="Main" version="2.0">
   <variable name="File_In" value="DXLG_DEMO_20140930.txt" description="File name passed in from calling project " />
   <variable name="defaultfront" value="casua2_ibe" />
   <variable name="defaultback" value="_in_.ctl" />
   <variable name="Control_Name" value="" />

   <module name="Main">

      <createWorkspace version="1.0" />

      <if label="If_filename_DEMO" condition="${Substring(File_In, 6,4) eq &apos;DEMO&apos;}">

         <setVariable label="Set control file name" name="Control_Name" value="${defaultfront}M${defaultback}" version="2.0" />

      </if>
      <if label="If_filename_FULL" condition="${Substring(File_In, 6,4) eq &apos;FULL&apos;}">

         <setVariable label="Set control file name" name="Control_Name" value="${defaultfront}Y${defaultback}" version="2.0" />

      </if>
      <if label="If_filename_NCOA" condition="${Substring(File_In, 6,4) eq &apos;NCOA&apos;}">

         <setVariable label="Set control file name" name="Control_Name" value="${defaultfront}Q${defaultback}" version="2.0" />

      </if>
      <!--D:\GoAnywhere\userdata\projects\Acxiom_ctl.txt-->

      <readFlatFile label="Read file" inputFile="resource:smb://CMRG_fs_Shared/FTPDATA/Acxiom_ctl.txt" outputRowSetVariable="lineread" recordDelimiter="CR" version="1.0" logLevel="debug" disabled="false" />

      <print label="Write Control file" file="${system.job.workspace}\${Control_Name}" append="true" version="1.0">
         <![CDATA[${lineread}]]>
      </print>

      <writeFixedWidth label="Write Control File" inputRowSetVariable="${lineread}" outputFile="${system.job.workspace}\${Control_Name}" whenFileExists="append" includeHeadings="false" version="1.0" disabled="true" />
      <deleteWorkspace version="1.0" disabled="true" />
   </module>
</project>



The input file is a text document with 8 lines, and I want to be able to modify a couple of the lines (by appending a variable to them) as I write the output file.
We are running version 4.6.1.
monahanks
 
Posts: 27
Joined: Wed Mar 30, 2011 10:19 am

Re: Can't write a single row from a RowSet to file

Postby Support_Rick » Tue Sep 30, 2014 4:38 pm

Monahanks,

You need to loop over your outputRowSetVariable (lineread) with a ForEach loop so that you can access the RowSet data record by record. Printing ${lineread} directly prints the RowSet object itself, not its rows. In pseudocode terms...

Code:
Main
  If...
  If...
  If...

  Read <MyFile> outputRowSetVariable="lineread"
  forEach itemsVariable="${lineread}" currentItemVariable="line"

    <!-- Process your Record here using ${line[1]} -->

    Print file="${Control_Name}" append="true"
      <![CDATA[${line[1]}${system.carriageReturn}]]>
    /Print

  /forEach

  deleteWorkspace
/Main
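
Written out as project XML, the pseudocode above might look roughly like the following. This is an untested sketch: the task and attribute names are copied from the projects posted earlier in this thread, and the output path reuses the workspace path from the question, which is an assumption about where the control file should go.

```xml
<readFlatFile label="Read file" inputFile="resource:smb://CMRG_fs_Shared/FTPDATA/Acxiom_ctl.txt" outputRowSetVariable="lineread" recordDelimiter="CR" version="1.0" />

<forEachLoop label="Loop over records" itemsVariable="${lineread}" currentItemVariable="line">

   <!-- Process the record here; ${line[1]} is the first column of the current row -->

   <print label="Write Control file" file="${system.job.workspace}\${Control_Name}" append="true" version="1.0">
      <![CDATA[${line[1]}${system.carriageReturn}]]>
   </print>

</forEachLoop>
```

To append a variable to particular lines, one option is to wrap an alternative Print task in an `if` that tests the loop's iteration counter, in the same way the first post tests `lineno`.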
Rick Elliott
Sr. Product Specialist
(402) 944.4242
(800) 949-4696
Support_Rick
Support Specialist
 
Posts: 215
Joined: Tue Jul 17, 2012 2:12 pm

