Names with accents not handled properly

Issue No. 132

Type

Bug

Status

Closed

Reported By

web

Component

API

Resolution

Not a Bug

Votes

0

Created

7/Apr/2013 9:00 PM EDT

Closed

17/Apr/2013 1:49 AM EDT

Description

A few quick examples:

Adrián Beltré is handled as Adrián Béltre

Ronny Cedeño is handled as Ronny Cedeño

Closing Comment

See comment below.

Comments

1. web 11/Apr/2013 at 12:48 PM EDT

Here's an update. Apparently Erik determined that the problem was the encoding of the JSON object. I am using PHP for my app... if I wrap utf8_decode around the results, the names display properly. This is not a bug in xmlstats. -mg

2. Erik Berg 16/Apr/2013 at 4:00 PM EDT

There is some confusion about the correct MIME type for JSON and whether it should include a charset declaration. The JSON RFC (http://www.ietf.org/rfc/rfc4627.txt) states "JSON text SHALL be encoded in Unicode. The default encoding is UTF-8," and specifies the MIME type as "application/json" with no required or optional parameters. That's what xmlstats uses for JSON results. When a charset is not explicitly set, some programs will fall back to a default encoding like ISO-8859-1 (Latin-1). As the example cases show, the UTF-8 bytes for á and é were interpreted as ISO-8859-1, and that's why two characters were printed for each accented letter.
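
For anyone hitting the same thing, here is a minimal sketch of the effect in Python. It does not call xmlstats; the payload is a stand-in built locally, and the variable names are only illustrative. It shows how UTF-8 JSON bytes read as ISO-8859-1 turn each accented letter into two characters, and how decoding as UTF-8 avoids that.

```python
import json

name = "Adrián Beltré"

# Stand-in for a response body: JSON text encoded as UTF-8 bytes
# (assumption: this mimics what a UTF-8 JSON API would send).
payload = json.dumps({"name": name}, ensure_ascii=False).encode("utf-8")

# A client that falls back to ISO-8859-1 maps every byte to one character,
# so each two-byte UTF-8 sequence becomes two characters.
wrong = json.loads(payload.decode("iso-8859-1"))["name"]

# Decoding the same bytes as UTF-8 recovers the original name.
right = json.loads(payload.decode("utf-8"))["name"]

# A string that was already mis-decoded can be repaired by reversing the step.
repaired = wrong.encode("iso-8859-1").decode("utf-8")

print(wrong)
print(right)
print(repaired)
```

The safer fix on the client side is usually to treat the response body as UTF-8 from the start rather than to convert it afterwards. PHP's utf8_decode converts a UTF-8 string to ISO-8859-1, which would explain why it made the names display correctly on a page rendered as ISO-8859-1.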