Currently to add a set of contacts I need to do the following:
Parameter names and values are escaped. Space characters are replaced by `+', and then reserved characters are escaped as described in , section 2.2: Non-alphanumeric characters are replaced by `%HH', a percent sign and two hexadecimal digits representing the ASCII code of the character. Line breaks are represented as "CR LF" pairs (i.e., `%0D%0A').
So my question is "is there any possibility for using utf-8 somehow?".
A simple example: I need to add a contact with "Aljoša" as the first name. How do I do that? So far every attempt has mangled the non-ANSI characters (Aljo?a).
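For reference, here is a minimal sketch of what the escaping rule quoted above produces for this name, assuming the client encodes the value as UTF-8 before percent-escaping. The parameter name `first_name` and the helper `form_escape` are made up for illustration and are not taken from the Constant Contact API; whether the server then decodes and stores the bytes correctly is a separate question.

```python
from urllib.parse import quote_plus, urlencode


def form_escape(value: str, charset: str = "utf-8") -> str:
    """Escape one form value: spaces become '+', other non-safe bytes become %HH."""
    return quote_plus(value, encoding=charset, errors="strict")


# "š" is U+0161, which is the two bytes C5 A1 in UTF-8.
print(form_escape("Aljoša"))      # Aljo%C5%A1a
# Space, a reserved character, and a CR LF line break.
print(form_escape("a b&c\r\n"))   # a+b%26c%0D%0A

# A whole request body; "first_name" is a hypothetical parameter name.
body = urlencode({"first_name": "Aljoša"}, encoding="utf-8")
print(body)                       # first_name=Aljo%C5%A1a
```

If the request itself looks like this on the wire and the stored name still comes back as "Aljo?a", the loss is probably happening after the escaping step rather than in it.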
Sorry for the problem you're running into. We are looking into this, as UTF-8 characters should be importable. This issue is not related to the escaping of reserved characters you mention above; that is standard for all form-encoded HTTP requests and does not involve UTF-8 characters. Once we have an update we will get back to you.
I didn't fully understand your answer the first time; after rereading it a few times, I think your conclusion is wrong. Nothing in the standards you mentioned imposes such a limitation: a big part of the internet works fine with UTF-8, HTML forms included. So I still think it's all about the escaping we have to use.
Can you pass this question to a technical person please? This is really a bug we can't find a workaround for.
Sorry for the confusion on my post.
Regarding escaping: the only characters you have to escape are the reserved set of characters in the RFC I referenced. You can escape any other content characters that you feel you need or want to; however, URI control characters should never be escaped.
We are doing some additional testing on the following languages to see if there are any limitations because we store all data in an ISO-8859-1 database: Simplified and Traditional Chinese, Arabic, Russian and Portuguese. If you would like us to test other languages, we can include them as well. I expect problems with some of these languages, as they contain characters outside the ISO-8859-1 character set; however, until our testing is completed I do not want to guess what the exact problems are.
We will update this thread with the results of our testing and possibly put up a sticky as well, depending on what we find out internally. It is worth noting that the only languages Constant Contact officially supports are listed in this FAQ. We are working on expanding this list in the future. I am unable to comment on roadmap timelines for this initiative at this time.
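To make the ISO-8859-1 limitation concrete, the following sketch checks whether a name can be represented in that character set at all. It only illustrates the character set itself; the helper names `fits_charset` and `unsupported_chars` are made up here, and nothing in it reflects how the Constant Contact backend actually handles input.

```python
def fits_charset(text: str, charset: str = "iso-8859-1") -> bool:
    """Return True if every character of text can be encoded in charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


def unsupported_chars(text: str, charset: str = "iso-8859-1") -> list:
    """List the characters of text that have no representation in charset."""
    return [ch for ch in text if not fits_charset(ch, charset)]


# Sample names: Portuguese, Slovenian and Russian.
for name in ["João", "Aljoša", "Стешенко"]:
    print(name, fits_charset(name), unsupported_chars(name))
```

Under these assumptions the Portuguese sample fits, because "ã" is part of ISO-8859-1, while "š" and every Cyrillic letter fall outside it.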
Now I get it, thanks Dave!
I somehow missed that FAQ (though I really was looking for something like it). So there is just no workaround then.
You mentioned "Simplified and Traditional Chinese, Arabic, Russian and Portuguese". Not that it helps a lot, but I'm curious: what does that mean? Do you also support these languages through some approach on top of ISO-8859-1?
I tried a couple of Russian names through the web interface, not the API (I'm Russian myself). One worked, i.e. the contact was successfully added to the list, but the second failed and always does. It's probably a different issue, though. The contact data I was trying to add was
lastname: Стешенко (my real name, incidentally).
CC fails and asks me to log in again.
Thank you for sharing this.
Every application we've developed for the last ten years uses a UTF8 database. I cannot imagine the issues your team must be trying to deal with now (figuring out what characters are represented in ISO-8859-1). To me it's inconceivable to use anything else (it's neither hard nor expensive). It's 2010, the world is flat, users are everywhere and speak every language.
The API's XML examples mention UTF8 support. I suppose this is for encoding during transmission only?
As a solution partner who has integrated the service, this creates a significant issue for our clients and gives fuel to those who want us to implement a different one (check our forums). I hope your team(s) can figure out the tactics to overcome these issues, or change your database store's charset, without much delay.
I'd recommend putting a focus on this one -- it's fundamental. I don't think clients should be wrong in assuming a business web service supports international characters.
We are currently working on testing a variety of languages to see what conversions our internal database code will do on submissions. Currently our tests show that of the languages I mentioned above, Simplified/Traditional Chinese, Russian and Arabic are not supported through the API. Most of the symbols are simply disregarded, some are incorrectly changed to ? characters. This is something we are going to work on documenting and making sure that this is fully disclosed to all API developers.
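The two symptoms described above (characters silently dropped, or turned into "?") match what a lossy conversion to ISO-8859-1 typically looks like. The sketch below reproduces both with Python's standard encoder error handlers. It is only an illustration under that assumption, not Constant Contact's actual conversion code, and `simulate_lossy_store` is a made-up name.

```python
def simulate_lossy_store(text: str, mode: str = "replace") -> str:
    """Round-trip text through ISO-8859-1.

    mode="replace" turns unencodable characters into "?";
    mode="ignore" drops them silently.
    """
    return text.encode("iso-8859-1", errors=mode).decode("iso-8859-1")


print(simulate_lossy_store("Aljoša", "replace"))    # Aljo?a
print(simulate_lossy_store("Aljoša", "ignore"))     # Aljoa
print(simulate_lossy_store("Стешенко", "replace"))  # ????????
print(simulate_lossy_store("João", "replace"))      # João
```

A client could run a check like this before submitting and warn the user when the round trip changes the value.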
I can certainly understand your feelings regarding UTF-8. We at Constant Contact agree with your position that a character set that supports all languages is the way to go forward with our applications. Unfortunately, our initial product was only used in English and our initial set up was an ISO-8859-1 database. Changing our application completely over to UTF-8 is a large undertaking. Fundamental to that undertaking is to cause no negative impact to any existing customer data. To this end, we are working very hard to ensure all existing data remains intact and that our entire database and software platform is updated.
I am unable to provide any time tables for this process. We are actively working on this and are trying to complete this project quickly and efficiently. We do hope you can understand that this is a very large project for us considering our size and the amount of data we have to convert. As we have more information regarding this, we will continue to update the forums with that information. You can be assured, once the project is completed we will make sure to announce it in our Developer Newsletter as well.
Thanks for the transparent and helpful comments Dave. I understand where CC is, how it got there and where it needs to go and appreciate the insight.
Reading that it is seen as an important issue and that the team there is working with a sense of urgency is encouraging. Although the current approach may mitigate some of the reported issues, it seems fraught with compromises; it must be expensive and frustrating to work within the current constraints. Isn't it reasonable to just deal with it without delay, i.e. pay the toll now and focus on the future? It's a price all service providers must pay. The stakes just get higher each day; I don't think there’s a shortcut.