Message 82744 - Python tracker

Author	rrenaud
Recipients	barry, jdwhitley, rhettinger, rrenaud, skip.montanaro
Date	2009-02-26.07:38:31
Message-id	<1235633918.65.0.869504943067.issue1818@psf.upfronthosting.co.za>
In-reply-to

Content

I am totally new to Python dev.  I reinvented a NamedTupleReader
tonight, only to find out that it was created a year ago.  My primary
motivation is that DictReader reads headers nicely, but DictWriter
totally sucks at handling them.

Consider doing some filtering on a CSV file, like so:

import csv

sample_data = [
    'title,latitude,longitude',
    'OHO Ofner & Hammecke Reinigungsgesellschaft mbH,48.128265,11.610848',
    'Kitchen Kaboodle,45.544241,-122.715728',
    'Walgreens,28.339727,-81.596367',
    'Gurnigel Pass,46.731944,7.447778'
    ]

def filter_with_dict_reader_writer():
  accepted_rows = []
  for row in csv.DictReader(sample_data):
    if float(row['latitude']) > 0.0 and float(row['longitude']) > 0.0:
      accepted_rows.append(row)

  # Re-read the header row to recover the column order.
  field_names = next(csv.reader(sample_data))
  output_writer = csv.DictWriter(open('accepted_by_dict.csv', 'w'),
                                 field_names)
  output_writer.writerow(dict(zip(field_names, field_names)))
  output_writer.writerows(accepted_rows)

You have to work so hard to maintain the headers when you write the file
with DictWriter.  I understand this is a limitation of dicts throwing
away the order information.  But namedtuples don't have that problem.
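
For comparison, here is a minimal sketch of the same filter on Python versions newer than this message, assuming `DictReader.fieldnames` and `DictWriter.writeheader()` (the latter was added in Python 2.7 / 3.2) are available. The function name and the in-memory output buffer are illustrative only, not part of any patch on this issue.

```python
import csv
import io

sample_data = [
    'title,latitude,longitude',
    'OHO Ofner & Hammecke Reinigungsgesellschaft mbH,48.128265,11.610848',
    'Kitchen Kaboodle,45.544241,-122.715728',
    'Walgreens,28.339727,-81.596367',
    'Gurnigel Pass,46.731944,7.447778',
]

def filter_with_writeheader(lines):
    """Hypothetical helper: keep rows whose latitude and longitude are both positive."""
    reader = csv.DictReader(lines)
    out = io.StringIO(newline='')
    # reader.fieldnames preserves the column order of the header row.
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames)
    writer.writeheader()
    for row in reader:
        if float(row['latitude']) > 0.0 and float(row['longitude']) > 0.0:
            writer.writerow(row)
    return out.getvalue()

filtered_csv = filter_with_writeheader(sample_data)
```

This removes the manual `dict(zip(...))` header row, but the field names still have to be passed from the reader to the writer by hand.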

NamedTupleReader and NamedTupleWriter should be inverses.  This means
that NamedTupleWriter needs to write headers.  This should produce
the same output as the dict writer example, but it's much cleaner.

def filter_with_named_tuple_reader_writer():
  accepted_rows = []
  for row in csv.NamedTupleReader(sample_data):
    if float(row.latitude) > 0.0 and float(row.longitude) > 0.0:
      accepted_rows.append(row)

  output_writer = csv.NamedTupleWriter(
      open('accepted_by_named_tuple.csv', 'w'))
  output_writer.writerows(accepted_rows)
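
To make the idea concrete, here is a rough sketch of how such a reader/writer pair could behave. It is not the patch attached to this issue, and `named_tuple_reader` and `NamedTupleWriter` below are hypothetical stand-ins, not names in the `csv` module. It assumes every header field is a valid Python identifier and writes the header once, taken from the first row's `_fields`.

```python
import csv
from collections import namedtuple

def named_tuple_reader(lines, typename='Row'):
    """Hypothetical reader: yield one namedtuple per data row."""
    reader = csv.reader(lines)
    header = next(reader, None)
    if header is None:
        return  # empty input: nothing to yield
    # Assumes the header fields are valid identifiers (e.g. 'latitude').
    row_type = namedtuple(typename, header)
    for values in reader:
        yield row_type(*values)

class NamedTupleWriter:
    """Hypothetical writer: emits the header before the first row."""

    def __init__(self, fileobj):
        self._writer = csv.writer(fileobj)
        self._header_written = False

    def writerow(self, row):
        if not self._header_written:
            # The namedtuple keeps its field order, so the header is free.
            self._writer.writerow(row._fields)
            self._header_written = True
        self._writer.writerow(row)

    def writerows(self, rows):
        for row in rows:
            self.writerow(row)
```

One consequence of taking the header from the first row is that an empty `writerows` call produces an empty file with no header; a real implementation might prefer to accept the field names explicitly as well.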

I patched on top of the existing NamedTupleWriter patch adding support
for writing headers.  I don't know if that's bad style/etiquette, etc.

History
Date	User	Action	Args
2009-02-26 07:38:39	rrenaud	set	recipients: + rrenaud, skip.montanaro, barry, rhettinger, jdwhitley
2009-02-26 07:38:38	rrenaud	set	messageid: <1235633918.65.0.869504943067.issue1818@psf.upfronthosting.co.za>
2009-02-26 07:38:36	rrenaud	link	issue1818 messages
2009-02-26 07:38:35	rrenaud	create