Epicserve

Convert Microsoft Word Smart Quotes to Straight Quotes

December 13, 2004 | 1:38 p.m. CST

If you use Movable Type or another content management system for your blog or corporate website, you will probably run into users who type up their articles in Microsoft Word before submitting them to the website. The problem with users who copy and paste their article from Word into your CMS is that Word converts straight quotes into smart quotes. And the problem with actual smart quote characters is that the browser might not be able to render the curly quotes properly, depending on the character set and encoding your server is using.

Luckily, I found a function on php.net that converts smart quotes into regular quotes, so that you can store the article in your database using regular quotes.

    function cleanupSmartQuotes($text) {
        // Windows-1252 byte values for Word's curly quotes and dashes
        $badwordchars = array(chr(145), chr(146), chr(147), chr(148), chr(150), chr(151));
        $fixedwordchars = array("'", "'", '"', '"', '-', '--');
        return str_replace($badwordchars, $fixedwordchars, $text);
    }

    // usage
    $article = cleanupSmartQuotes($_POST['article']);
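To make the byte values concrete: in the Windows-1252 character set, 145 and 146 (0x91, 0x92) are the single curly quotes, 147 and 148 (0x93, 0x94) are the double curly quotes, and 150 and 151 (0x96, 0x97) are the en dash and em dash. Here is a small run of the function against a made-up sample string (the sample text is only an illustration, not something from a real article):

```php
<?php
function cleanupSmartQuotes($text) {
    $badwordchars = array(chr(145), chr(146), chr(147), chr(148), chr(150), chr(151));
    $fixedwordchars = array("'", "'", '"', '"', '-', '--');
    return str_replace($badwordchars, $fixedwordchars, $text);
}

// Hypothetical sample: Windows-1252 bytes written as hex escapes.
// \x93 and \x94 are the double curly quotes, \x92 is the apostrophe, \x97 is the em dash.
$sample = "\x93It\x92s done\x94 \x97 mostly";

echo cleanupSmartQuotes($sample), "\n";
// prints: "It's done" -- mostly
```

Note that this only matches single-byte Windows-1252 input. Text that arrives in another encoding will pass through unchanged.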

There are some problems that I ran into when trying to get this to work on my server that you should be aware of. For some reason it worked when I converted text in a local variable, but not when I converted text posted from a form. I spent some time trying to figure it out before I finally asked some friends for help. They ran my code on their server and found that it worked just fine for them. This led me to believe that it had something to do with the character encoding on my server. After messing with the encoding settings in /etc/sysconfig/i18n and in /etc/httpd/conf/httpd.conf, I finally found that the setting causing the problem was in the php.ini file. After changing default_charset = "utf-8" to default_charset = "iso-8859-1", it started working.
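A likely explanation, though I can't confirm it for this particular server: when pages are served as UTF-8, browsers normally submit form data as UTF-8 too, so each curly quote arrives as a three-byte sequence (226, 128, plus a third byte) instead of a single Windows-1252 byte, and the chr(145)-style lookups never match. If changing php.ini is not an option, a rough sketch like the one below handles both cases. The function name cleanupSmartQuotesAnyEncoding and the sample strings are made up for illustration, and treating anything that is not valid UTF-8 as Windows-1252 is an assumption.

```php
<?php
// Hypothetical helper (not from php.net): straightens Word quotes and dashes
// whether the input is UTF-8 or single-byte Windows-1252.
function cleanupSmartQuotesAnyEncoding($text) {
    // preg_match() with the /u modifier returns 1 only if $text is valid UTF-8.
    if (preg_match('//u', $text) === 1) {
        $map = array(
            "\xE2\x80\x98" => "'",   // U+2018 left single quote
            "\xE2\x80\x99" => "'",   // U+2019 right single quote
            "\xE2\x80\x9C" => '"',   // U+201C left double quote
            "\xE2\x80\x9D" => '"',   // U+201D right double quote
            "\xE2\x80\x93" => '-',   // U+2013 en dash
            "\xE2\x80\x94" => '--',  // U+2014 em dash
        );
    } else {
        // Assumption: input that is not valid UTF-8 is Windows-1252.
        $map = array(
            "\x91" => "'", "\x92" => "'",
            "\x93" => '"', "\x94" => '"',
            "\x96" => '-', "\x97" => '--',
        );
    }
    return strtr($text, $map);
}

// Hypothetical samples of the same phrase in each encoding.
$utf8Sample   = "\xE2\x80\x9CIt\xE2\x80\x99s done\xE2\x80\x9D";
$cp1252Sample = "\x93It\x92s done\x94";

echo cleanupSmartQuotesAnyEncoding($utf8Sample), "\n";   // prints: "It's done"
echo cleanupSmartQuotesAnyEncoding($cp1252Sample), "\n"; // prints: "It's done"
```

The single-byte replacements are deliberately skipped for valid UTF-8 input, because bytes such as 0x93 also occur inside multi-byte UTF-8 characters (the en dash, for example) and replacing them would corrupt the text.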

Now that this works, I can continue to let people use Word to write their articles without having to explain why smart quotes from Word don't work in the CMS or how to turn them off in Word.

And for all you people out there who like smart quotes when reading an article, you can use SmartyPants-PHP to change straight quotes into smart quotes.

Related tags: Microsoft, Quotes, Smart, Word

Comments

1.   At 2:35 p.m. CST on Dec. 13, 2004, Jeff Croft wrote:

Markdown and SmartyPants take care of all my needs in this area. SmartyPants covers most of the requisite typographic goodness, and Markdown allows non-savvy users to add other HTML elements. John Gruber's projects are really great...

2.   At 10:10 a.m. CST on Jan. 3, 2005, mindtwist wrote:

Thanks! :)

3.   At 8:01 p.m. CDT on Aug. 18, 2005, jiml wrote:

Thanks so much!

4.   At 7:05 a.m. CDT on Aug. 23, 2005, Paulg wrote:

Awesome.

Lost my lunchtime hunting this down.

You're a star.

5.   At 3:26 p.m. CDT on Sept. 27, 2005, Eric wrote:

I thought - Why not do a quick Google search to see if someone else has already solved this problem? Good thing I did. You just saved me a couple hours of work with a simple 3 line function and a php.ini edit!

6.   At 12:11 a.m. CST on March 27, 2006, chas wrote:

another found on php.net

tim_meredith at s4s dot org

--Convert MSWord Quotes-- Use this before any conversion to HTML entities or characters to clean up a form entry cut and pasted from MSWord.

    function fixword($scratch) {
        // 226, 128 are the first two bytes of each curly quote in UTF-8
        $start = chr(226).chr(128);
        $word = array(); $fixword = array();
        $word[] = $start.chr(152); $fixword[] = "'";
        $word[] = $start.chr(153); $fixword[] = "'";
        $word[] = $start.chr(156); $fixword[] = '"';
        $word[] = $start.chr(157); $fixword[] = '"';
        return str_replace($word, $fixword, $scratch);
    }

7.   At 8:19 a.m. CST on March 27, 2006, Brent O'Connor wrote:

chas,

The only thing that is different about that function is that it adds Extended ASCII characters 226 and 128 to every bad Word character. I'm not sure how adding those characters would make the function any better, or even whether it would work.

8.   At 11:27 p.m. CDT on Oct. 18, 2006, Andrew Kleimeyer wrote:

I've found that the original function is inadequate and the function "chas" added in the comments is better.

When pasted into a text box, Microsoft smart quotes show up as three characters: 226, 128, and a third character that identifies left double, left single, right double, or right single.

Your replace function needs to remove the first two characters and substitute the straight quote for the third.

Note: my text is being pasted from Word 2003. Earlier versions of Word may have used different characters.

9.   At 11:55 a.m. CDT on Oct. 20, 2006, Brent O'Connor wrote:

Andrew,

My function works for me, even on the newest version of Word.

Comments are closed.
