PHP Cookbook: Expanding and Compressing Tabs

Problem
You want to change spaces to tabs (or tabs to spaces) in a string while keeping text aligned with tab stops. For example, you want to display formatted text to users in a standardized way.

Solution
Use str_replace( ) to switch spaces to tabs or tabs to spaces, as shown in Example 1-22.

Example 1-22. Switching tabs and spaces
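[sourcecode language="php"]<?php
// Note: the mysql_* functions were removed in PHP 7.0; on current PHP,
// fetch the row with mysqli or PDO instead.
$r = mysql_query("SELECT message FROM messages WHERE id = 1") or die();
$ob = mysql_fetch_object($r);
// Replace each space with a tab, and each tab with a single space
$tabbed = str_replace(' ', "\t", $ob->message);
$spaced = str_replace("\t", ' ', $ob->message);
print "With Tabs: <pre>$tabbed</pre>";
print "With Spaces: <pre>$spaced</pre>";
?>[/sourcecode]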

Using str_replace( ) for conversion, however, doesn’t respect tab stops. If you want tab stops every eight characters, a line beginning with a five-letter word and a tab should have that tab replaced with three spaces, not one. Use the pc_tab_expand( ) function shown in Example 1-23 to turn tabs into spaces in a way that respects tab stops.

Example 1-23. pc_tab_expand( )
[sourcecode language="php"]<?php
function pc_tab_expand($text) {
    // Each pass expands the first run of tabs on every line; repeat until none remain
    while (strpos($text, "\t") !== false) {
        $text = preg_replace_callback('/^([^\t\n]*)(\t+)/m', 'pc_tab_expand_helper', $text);
    }
    return $text;
}
function pc_tab_expand_helper($matches) {
    $tab_stop = 8;
    // $matches[1] is the text before the tabs; $matches[2] is the run of tabs
    return $matches[1] .
        str_repeat(' ', strlen($matches[2]) *
            $tab_stop - (strlen($matches[1]) % $tab_stop));
}
$spaced = pc_tab_expand($ob->message);
?>
[/sourcecode]

You can use the pc_tab_unexpand( ) function shown in Example 1-24 to turn spaces back to tabs.

Example 1-24. pc_tab_unexpand( )
[sourcecode language="php"]
<?php
function pc_tab_unexpand($text) {
    $tab_stop = 8;
    $lines = explode("\n", $text);
    foreach ($lines as $i => $line) {
        // Expand any tabs to spaces
        $line = pc_tab_expand($line);
        $chunks = str_split($line, $tab_stop);
        $chunkCount = count($chunks);
        // Scan all but the last chunk
        for ($j = 0; $j < $chunkCount - 1; $j++) {
            $chunks[$j] = preg_replace('/ {2,}$/', "\t", $chunks[$j]);
        }
        // If the last chunk is a tab-stop's worth of spaces
        // convert it to a tab; otherwise, leave it alone.
        // (The count check covers empty lines, for which str_split()
        // returns an empty array as of PHP 8.2.)
        if ($chunkCount > 0 && $chunks[$chunkCount - 1] == str_repeat(' ', $tab_stop)) {
            $chunks[$chunkCount - 1] = "\t";
        }
        // Recombine the chunks
        $lines[$i] = implode('', $chunks);
    }
    // Recombine the lines
    return implode("\n", $lines);
}
$tabbed = pc_tab_unexpand($ob->message);
?>
[/sourcecode]

Both functions take a string as an argument and return the string appropriately modified.

Discussion
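Each function assumes tab stops every eight characters. To change that, edit the $tab_stop variable in both pc_tab_expand_helper( ) and pc_tab_unexpand( ). Because pc_tab_unexpand( ) calls pc_tab_expand( ) before it splits each line into chunks, the two values must match.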
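
If you would rather not keep two hard-coded values in sync, one option is to pass the tab-stop width as an argument. The following is a minimal sketch of that idea and is not part of the original recipe: the name tab_expand_width( ) and its $tab_stop parameter are hypothetical, and it assumes PHP 5.3 or later so that a closure can stand in for pc_tab_expand_helper( ).

[sourcecode language="php"]<?php
// Hypothetical variant of pc_tab_expand() that takes the tab-stop width
// as an argument instead of hard-coding it in a helper function.
function tab_expand_width($text, $tab_stop = 8) {
    // Same padding formula as pc_tab_expand_helper(), with the width captured
    $pad = function ($matches) use ($tab_stop) {
        $spaces = strlen($matches[2]) * $tab_stop
                - (strlen($matches[1]) % $tab_stop);
        return $matches[1] . str_repeat(' ', $spaces);
    };
    // Expand the first run of tabs on each line until no tabs remain
    while (strpos($text, "\t") !== false) {
        $text = preg_replace_callback('/^([^\t\n]*)(\t+)/m', $pad, $text);
    }
    return $text;
}

var_dump(tab_expand_width("ab\tc", 4));   // string(5) "ab  c"
var_dump(tab_expand_width("hello\tx"));   // string(9) "hello   x"
?>[/sourcecode]

A matching variant of pc_tab_unexpand( ) would need to accept the same width and pass it through to the expansion step.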

The regular expression in pc_tab_expand( ) matches both a group of tabs and all the text in a line before that group of tabs. It needs to match the text before the tabs because the length of that text affects how many spaces the tabs should be replaced with so that subsequent text is aligned with the next tab stop. The function doesn’t just replace each tab with eight spaces; it adjusts text after tabs to line up with tab stops.
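
To make the arithmetic concrete, here is a small sketch that computes the padding with the same formula pc_tab_expand_helper( ) uses. The function name tab_padding( ) and the sample prefixes are made up for this illustration; columns are counted from zero.

[sourcecode language="php"]<?php
// Hypothetical helper: number of spaces that replace a run of $tabs tabs
// following $prefix, using the same formula as pc_tab_expand_helper().
function tab_padding($prefix, $tabs, $tab_stop = 8) {
    return $tabs * $tab_stop - (strlen($prefix) % $tab_stop);
}

$cases = array(
    array('hello', 1),     // 5 characters, one tab
    array('hello', 2),     // 5 characters, two tabs
    array('abcdefgh', 1),  // prefix ends exactly on a tab stop
);
foreach ($cases as $case) {
    list($prefix, $tabs) = $case;
    $spaces = tab_padding($prefix, $tabs);
    print "$prefix + $tabs tab(s): $spaces spaces, next column " .
        (strlen($prefix) + $spaces) . "\n";
}
// hello + 1 tab(s): 3 spaces, next column 8
// hello + 2 tab(s): 11 spaces, next column 16
// abcdefgh + 1 tab(s): 8 spaces, next column 16
?>[/sourcecode]

In each case the text after the tabs starts at a multiple of the tab-stop width, which is the property a plain str_replace( ) cannot guarantee.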

Similarly, pc_tab_unexpand( ) doesn’t just look for eight consecutive spaces and then replace them with one tab character. It divides up each line into eight-character chunks and then substitutes ending whitespace in those chunks (at least two spaces) with tabs. This not only preserves text alignment with tab stops; it also saves space in the string.
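
The chunking step can be seen in isolation with a short sketch. The sample line is invented for illustration, and the loop mirrors the middle of pc_tab_unexpand( ) rather than replacing it.

[sourcecode language="php"]<?php
$tab_stop = 8;
// Invented sample: "ab" padded to column 8, "cdefgh" padded to column 16, then "i"
$line = 'ab' . str_repeat(' ', 6) . 'cdefgh' . str_repeat(' ', 2) . 'i';

$chunks = str_split($line, $tab_stop);   // three chunks: 8, 8, and 1 characters
$last = count($chunks) - 1;
for ($j = 0; $j < $last; $j++) {
    // A trailing run of two or more spaces collapses to a single tab
    $chunks[$j] = preg_replace('/ {2,}$/', "\t", $chunks[$j]);
}
$packed = implode('', $chunks);

print json_encode($packed) . "\n";                      // "ab\tcdefgh\ti"
print strlen($line) . ' -> ' . strlen($packed) . "\n";  // 17 -> 11
?>[/sourcecode]

A chunk that ends in a single space is left alone, since swapping one space for one tab would not shorten the string.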

See Also
Documentation on str_replace( ) at http://www.php.net/str-replace, on preg_replace_callback( ) at http://www.php.net/preg_replace_callback, and on str_split( ) at http://www.php.net/str_split. Recipe 22.10 has more information on preg_replace_callback( ).