<?xml version="1.0" encoding="UTF-8"?><!-- generator="wordpress.com" -->
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	>

<channel>
	<title>regular-expressions &#171; WordPress.com Tag Feed</title>
	<link>http://en.wordpress.com/tag/regular-expressions/</link>
	<description>Feed of posts on WordPress.com tagged "regular-expressions"</description>
	<pubDate>Sat, 28 Nov 2009 03:22:33 +0000</pubDate>

	<generator>http://en.wordpress.com/tags/</generator>
	<language>en</language>

<item>
<title><![CDATA[Regular Expressions - Negated Character Classes and Anchors]]></title>
<link>http://xaegr.wordpress.com/2009/11/26/regexp-2-negatives-and-anchors/</link>
<pubDate>Thu, 26 Nov 2009 13:11:46 +0000</pubDate>
<dc:creator>Xaegr</dc:creator>
<guid>http://xaegr.wordpress.com/2009/11/26/regexp-2-negatives-and-anchors/</guid>
<description><![CDATA[We continue our conversation about regular expressions. In the previous post I covered the basics, and in this one we will]]></description>
<content:encoded><![CDATA[<div class='snap_preview'><p><img style="border-bottom:0;border-left:0;display:inline;border-top:0;border-right:0;margin:0 10px 10px 0;" title="regexp-2" border="0" alt="regexp-2" align="left" src="http://xaegr.files.wordpress.com/2009/11/regexp2.jpg?w=86&#038;h=119" width="86" height="119" />We continue our conversation about regular expressions. In the <a href="http://xaegr.wordpress.com/2009/11/20/regexp-1-intro/">previous post</a> I covered the basics, and in this one we will look at some more “advanced” regular expression constructs.</p>
<p>I assume you already know how to tell a regular expression which characters and/or character sequences a string must contain in order to match. But what if you need to specify not the characters that must be present, but the ones that must not be? For instance, if you want to output only the consonants, you could of course list them all, or you could use a negated character class containing the vowels, for example:</p>
<pre>PS C:\&#62; &#34;a&#34;,&#34;b&#34;,&#34;c&#34;,&#34;d&#34;,&#34;e&#34;,&#34;f&#34;,&#34;g&#34;,&#34;h&#34; -match &#34;[^aoueyi]&#34;
b
c
d
f
g
h</pre>
<p>A &#34;caret&#34; as the first character of a character class means negation. That is, any character except the ones listed in the class may appear at that position. To negate the shorthand classes (<code>\d</code>, <code>\w</code>, <code>\s</code>) you do not need to wrap them in square brackets; it is enough to&#8230; switch them to upper case <img src='http://s.wordpress.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  For example, <code>\D</code> means &#34;anything except a digit&#34;, and <code>\S</code> means &#34;anything except whitespace&#34;.</p>
<pre>PS C:\&#62; &#34;a&#34;,&#34;b&#34;,&#34;1&#34;,&#34;c&#34;,&#34;45&#34; -match &#34;\D&#34;
a
b
c
PS C:\&#62; &#34;a&#34;,&#34;-&#34;,&#34;*&#34;,&#34;c&#34;,&#34;&#38;&#34; -match &#34;\W&#34;
-
*
&#38;</pre>
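<p>For readers without a PowerShell console at hand, the same filters can be sketched in Python. This is only an illustration under a few assumptions: the helper name <code>match_filter</code> is made up here, Python's <code>re</code> module stands in for the .NET engine, and <code>#</code> replaces the ampersand from the sample above.</p>

```python
import re

def match_filter(items, pattern):
    """Keep the items in which the pattern occurs anywhere, ignoring case."""
    rx = re.compile(pattern, re.IGNORECASE)
    return [s for s in items if rx.search(s)]

# Negated class: any character that is not one of the listed vowels.
consonants = match_filter(["a", "b", "c", "d", "e", "f", "g", "h"], r"[^aoueyi]")
# Upper-case shorthands: \D is "not a digit", \W is "not a word character".
non_digits = match_filter(["a", "b", "1", "c", "45"], r"\D")
non_word = match_filter(["a", "-", "*", "c", "#"], r"\W")

print(consonants)  # ['b', 'c', 'd', 'f', 'g', 'h']
print(non_digits)  # ['a', 'b', 'c']
print(non_word)    # ['-', '*', '#']
```

<p>Note that <code>45</code> is dropped by <code>\D</code> because every one of its characters is a digit; a string such as <code>4a</code> would be kept.</p>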
<p>Already far more powerful than ordinary wildcards, isn't it? <img src='http://s.wordpress.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  And we have only just started on the basics! Character classes only let us describe the contents of a single position: one character located somewhere in the string. But what if we need, say, to select all the words that begin with the letter <code>w</code>? If we simply put that letter into the regular expression, it will match every string in which <code>w</code> occurs at all, whether at the beginning, in the middle, or at the end. This is where &#34;anchors&#34; come to the rescue. They tie the comparison to a specific position in the string. <code>^</code> (caret) anchors the start of the string, and <code>$</code> (dollar sign) marks its end. Don't get confused: <code>^</code> means negation only as the first character of a character class; outside a class the same character is an anchor <img src='http://s.wordpress.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  Yes, the authors of regular expressions were clearly short on special characters and reused them in more than one place wherever they could (we will talk about the second meaning of <code>$</code> later) <img src='http://s.wordpress.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  But it is easier to see with an example:</p>
<pre>PS C:\&#62; Get-Process &#124; where {$_.name -match &#34;^w&#34;}

Handles  NPM(K)    PM(K)      WS(K) VM(M)   CPU(s)     Id ProcessName
-------  ------    -----      ----- -----   ------     -- -----------
     80      10     1460        156    47     0,11    452 wininit
    114       9     2732       1428    55     0,56   3508 winlogon
    162      11     3660       1652    44     0,14   3620 wisptis
    225      20     5076       4308    95    31,33   3800 wisptis
    469      28     9572      11904   101     3,23   1844 wlcrasvc
    706      54    52452      43008   632     9,64   1072 wmdc
    105      10     2308       1428    76     0,08   4056 wuauclt</pre>
<p>This command listed the processes in which the character <code>w</code> comes immediately after the start of the name (<code>^</code>). In other words, the name begins with <code>w</code>. To make the example a little harder, and the idea easier to grasp, let's add a “caret” in its role as class negation:</p>
<pre>PS C:\&#62; Get-Process &#124; where {$_.name -match &#34;^w[^l-z]&#34;}

Handles  NPM(K)    PM(K)      WS(K) VM(M)   CPU(s)     Id ProcessName
-------  ------    -----      ----- -----   ------     -- -----------
     80      10     1460        156    47     0,11    452 wininit
    114       9     2732       1428    55     0,56   3508 winlogon
    162      11     3660       1652    44     0,14   3620 wisptis
    225      20     5076       4308    95    31,50   3800 wisptis</pre>
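<p>The two roles of the caret can be compared side by side in a small Python sketch. The list of names is taken from the output above, with <code>powershell</code> added as an assumed extra entry to show a <code>w</code> that is not at the start.</p>

```python
import re

# Process names from the listing, plus "powershell" as an assumed extra.
names = ["wininit", "winlogon", "wisptis", "wlcrasvc", "wmdc", "wuauclt", "powershell"]

# Outside a class, ^ anchors the match to the start of the string.
starts_with_w = [n for n in names if re.search(r"^w", n, re.IGNORECASE)]

# Inside a class, a leading ^ negates it: second character not in l-z.
second_not_l_to_z = [n for n in names if re.search(r"^w[^l-z]", n, re.IGNORECASE)]

print(starts_with_w)      # ['wininit', 'winlogon', 'wisptis', 'wlcrasvc', 'wmdc', 'wuauclt']
print(second_not_l_to_z)  # ['wininit', 'winlogon', 'wisptis']
```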
<p>Now the command listed the processes whose name begins with <code>w</code> and whose next character is anything outside the range <code>l-z</code>.</p>
<p>Notice that the examples are starting to look like gibberish, and yet you can already read them <img src='http://s.wordpress.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /><br />
  <br />And to reinforce the point, let's try the second anchor, the end of the string:</p>
<pre>PS C:\&#62; &#34;Яблоки&#34;,&#34;Груши&#34;,&#34;Дыня&#34;,&#34;Енот&#34;,&#34;Апельсины&#34;,&#34;Персик&#34; -match &#34;[ыи]$&#34;
Яблоки
Груши
Апельсины</pre>
<p>This expression returned all the words whose last letter is <code>и</code> or <code>ы</code>.</p>
<p>If you can describe the contents of the entire string exactly, you can use both anchors at once:</p>
<pre>PS C:\&#62; &#34;abc&#34;,&#34;adc&#34;,&#34;aef&#34;,&#34;bca&#34;,&#34;aeb&#34;,&#34;abec&#34;,&#34;abce&#34; -match &#34;^a.[cb]$&#34;
abc
adc
aeb</pre>
<p>This regular expression returns all strings that begin with the letter a, followed by any single character (the dot), then the character c or b, and then the end of the string.</p>
<p>To be continued… <img src='http://s.wordpress.com/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /> </p>
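<p>As a rough cross-check, the two anchor examples can be reproduced in Python. The word lists are the ones used above; <code>re.fullmatch</code> is Python's own shortcut and is shown only as an alternative to writing both anchors by hand.</p>

```python
import re

fruit = ["Яблоки", "Груши", "Дыня", "Енот", "Апельсины", "Персик"]
# $ ties the class to the end of the string.
ends_with_i = [w for w in fruit if re.search(r"[ыи]$", w, re.IGNORECASE)]

words = ["abc", "adc", "aef", "bca", "aeb", "abec", "abce"]
# ^ and $ together describe the whole string: a, any character, then c or b.
anchored = [w for w in words if re.search(r"^a.[cb]$", w)]
# fullmatch requires the pattern to cover the entire string without anchors.
full = [w for w in words if re.fullmatch(r"a.[cb]", w)]

print(ends_with_i)  # ['Яблоки', 'Груши', 'Апельсины']
print(anchored)     # ['abc', 'adc', 'aeb']
print(full)         # ['abc', 'adc', 'aeb']
```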
</div>]]></content:encoded>
</item>
<item>
<title><![CDATA[Regular Expressions - Introduction]]></title>
<link>http://xaegr.wordpress.com/2009/11/20/regexp-1-intro/</link>
<pubDate>Fri, 20 Nov 2009 19:03:18 +0000</pubDate>
<dc:creator>Xaegr</dc:creator>
<guid>http://xaegr.wordpress.com/2009/11/20/regexp-1-intro/</guid>
<description><![CDATA[I quite often get questions that are less about PowerShell itself than about using]]></description>
<content:encoded><![CDATA[<div class='snap_preview'><p><img style="display:inline;border-width:0;margin:0 10px 5px 0;" title="regexp-0" border="0" alt="regexp-0" align="left" src="http://xaegr.files.wordpress.com/2009/11/regexp0.png?w=215&#038;h=315" width="215" height="315" /> I quite often get questions that are less about PowerShell itself than about using regular expressions in it. That is understandable: regular expressions (“regexps” for short) are enormously powerful and can make the life of a system administrator or programmer much easier. Yet in the world of Windows system administration they are little known and unpopular: in cmd.exe practically the only way to use them is the findstr.exe utility, which offers very little functionality and uses a heavily cut-down dialect of regular expressions. In VBScript the regular expression functionality is also well hidden and is rarely used. In PowerShell, however, the authors of the language made sure that regular expressions are easy to reach, convenient to use, and as capable as possible. That last point turned out to be fairly simple: PowerShell uses the .NET regular expression implementation, which is one of the most capable and best-performing ones and can even compete with the acknowledged leader in this area, Perl <img src='http://s.wordpress.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p>So, let's get down to business. What are regular expressions? I don't remember the proper, dry definitions from the clever books, and there is no need: anyone interested can read them on their own <img src='http://s.wordpress.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  Regular expressions are a special mini-language for parsing text data. With it you can split strings into components, pick out the parts of strings you need for further processing, and perform replacements, all with great flexibility and precision.</p>
<p>However&#8230; it is better to begin not with regular expressions themselves but with a simpler technology that serves similar purposes and that every Windows administrator knows: wildcards. You have surely run the dir command many times and given it a file mask as an argument, for example *.exe. Here the asterisk means “any number of any characters”. The question mark can be used in the same way and means “any single character”, so <code>dir ??.exe</code> lists all files with the .exe extension and a two-character name. PowerShell's implementation of wildcards offers one more construct: character classes. For example, <code>[a-f]</code> means “any single character from <code>a</code> to <code>f</code> (a,b,c,d,e,f)”, and <code>[smw]</code> means any one of the three letters (<code>s</code>, <code>m</code> or <code>w</code>). Thus the command <code>get-childitem [smw]??.exe</code> lists files with the <code>.exe</code> extension whose name is three characters long and begins with <code>s</code>, <code>m</code>, or <code>w</code>. Not bad, is it? Well, compared with what regular expressions can do, that is baby talk <img src='http://s.wordpress.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  But let's start small <img src='http://s.wordpress.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p>To begin with, we will use the PowerShell <code>-match</code> operator, which compares the text on its left with the regular expression on its right. If the text matches the regular expression, the operator returns <code>True</code>; otherwise it returns <code>False</code>.</p>
<pre>PS C:\&#62; &#34;PowerShell&#34; -match &#34;Power&#34;
True</pre>
<p>You have probably noticed that a regular expression comparison only looks for an occurrence; the text does not have to match in full (this can be changed, of course, but more on that later). In other words, it is enough for the pattern to occur somewhere in the text.</p>
<pre>PS C:\&#62; &#34;Shell&#34; -match &#34;Power&#34;
False
PS C:\&#62; &#34;PowerShell&#34; -match &#34;rsh&#34;
True</pre>
<p>One more subtlety: the <code>-match</code> operator is case-insensitive by default (like the other text operators in PowerShell); if you need case sensitivity, use <code>-cmatch</code>:</p>
<pre>PS C:\&#62; &#34;PowerShell&#34; -cmatch &#34;rsh&#34;
False</pre>
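<p>The same two points (searching for an occurrence, and case sensitivity) can be sketched in Python. One difference to keep in mind: Python's <code>re.search</code> is case-sensitive unless asked otherwise, so the flag plays the role that the choice between <code>-match</code> and <code>-cmatch</code> plays in PowerShell.</p>

```python
import re

text = "PowerShell"

# re.search looks for the pattern anywhere in the text, like -match.
found_power = re.search("Power", text) is not None
found_in_shell = re.search("Power", "Shell") is not None

# -match ignores case; in Python that has to be requested with a flag.
case_insensitive = re.search("rsh", text, re.IGNORECASE) is not None
# Without the flag the comparison is case-sensitive, like -cmatch.
case_sensitive = re.search("rsh", text) is not None

print(found_power, found_in_shell, case_insensitive, case_sensitive)
# True False True False
```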
<p>Regular expressions can also use character classes:</p>
<pre>PS C:\&#62; Get-Process &#124; where {$_.name -match &#34;sy[ns]&#34;}

Handles  NPM(K)    PM(K)      WS(K) VM(M)   CPU(s)     Id ProcessName
-------  ------    -----      ----- -----   ------     -- -----------
    165      11     2524       8140    79     0,30   5228 mobsync
    114      10     3436       3028    83    50,14   3404 SynTPEnh
    149      11     2356        492    93     0,06   1592 SynTPStart
    810       0      116        380     6               4 System</pre>
<p>And ranges within those classes:</p>
<pre>PS C:\&#62; &#34;яблоко&#34;,&#34;апельсин&#34;,&#34;груша&#34;,&#34;абрикос&#34; -match &#34;а[а-п]&#34;
апельсин
абрикос</pre>
<p>By the way, here I put an array of strings on the left side of the <code>-match</code> operator, and it accordingly returned only the strings that matched the regular expression.</p>
<p>Of course, listed characters and ranges can be combined: for example, the class <code>[агдэ-я]</code> means “а or г or д or any character from э to я inclusive”. But it is far more interesting to use ranges to define whole classes of characters. For example, <code>[а-я]</code> means any letter of the Russian alphabet (except ё, which lies outside this range), and <code>[a-z]</code> any letter of the English one. The same can be done with digits: the following command lists all processes whose names contain digits:</p>
<pre>PS C:\&#62; Get-Process &#124; where {$_.name -match &#34;[0-9]&#34;}

Handles  NPM(K)    PM(K)      WS(K) VM(M)   CPU(s)     Id ProcessName
-------  ------    -----      ----- -----   ------     -- -----------
     93      10     1788       2336    70     1,25    548 FlashUtil10c
    158      12     6500       1024    96     0,14   3336 smax4pnp
     30       6      764        160    41     0,02   3920 TabTip32</pre>
<p>Because this class is used so often, it has a dedicated shorthand: <code>\d</code> (from the word digit). For ordinary ASCII text it means the same as <code>[0-9]</code> but is much shorter <img src='http://s.wordpress.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  Strictly speaking, in .NET <code>\d</code> also matches decimal digits from other Unicode scripts.</p>
<pre>PS C:\&#62; Get-Process &#124; where {$_.name -match &#34;\d&#34;}

Handles  NPM(K)    PM(K)      WS(K) VM(M)   CPU(s)     Id ProcessName
-------  ------    -----      ----- -----   ------     -- -----------
     93      10     1788       2336    70     1,25    548 FlashUtil10c
    158      12     6500       1024    96     0,14   3336 smax4pnp
     30       6      764        160    41     0,02   3920 TabTip32</pre>
<p>A shorthand also exists for the class “any letter of any alphabet, any digit, or the underscore”. It is written <code>\w</code> (from word) and is roughly equivalent to <code>[a-zа-я_0-9]</code> (<code>\w</code> also includes characters of other alphabets that are used to write words).</p>
<p>You will also certainly come across another popular class: <code>\s</code>, “a space or another whitespace character” (for example, a tab). It is short for space. In most cases you can write a space simply as a space <img src='http://s.wordpress.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  but this construct makes a regular expression more readable.</p>
<p>No less popular is the character <code>.</code> (dot). A dot in regular expressions is similar in meaning to the question mark in wildcards: it stands for any single character (by default, any character except a newline).</p>
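<p>The shorthand classes can be probed with a few one-line checks. The sketch below uses Python's <code>re</code> module, which, as far as these particular points go, behaves like the .NET engine; the sample strings are made up for illustration.</p>

```python
import re

# \d covers more than 0-9: U+0663 is ARABIC-INDIC DIGIT THREE.
arabic_three = "\u0663"
d_matches = re.fullmatch(r"\d", arabic_three) is not None
range_matches = re.fullmatch(r"[0-9]", arabic_three) is not None

# \w accepts letters of any alphabet, digits and the underscore.
word_chars = re.findall(r"\w", "я_7-z")

# \s accepts a space as well as a tab.
whitespace = re.findall(r"\s", "a b\tc")

# The dot accepts any single character except a newline.
dot_letter = re.fullmatch(r"a.c", "abc") is not None
dot_newline = re.fullmatch(r"a.c", "a\nc") is not None

print(d_matches, range_matches)  # True False
print(word_chars)                # ['я', '_', '7', 'z']
print(whitespace)                # [' ', '\t']
print(dot_letter, dot_newline)   # True False
```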
<p>All of the constructs above can be used on their own or inside character classes; for example, <code>[\s\d]</code> matches any digit or whitespace character. If you want to include the <code>-</code> (hyphen/minus) character in a class, either escape it with <code>\</code> (backslash) or put it at the start of the class so that it is not mistaken for a range:</p>
<pre>PS C:\&#62; &#34;?????&#34;,&#34;Word&#34;,&#34;123&#34;,&#34;-&#34; -match &#34;[-\d]&#34;
123
-</pre>
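<p>A short Python sketch of the hyphen rule, using the readable strings from the example above. The last two lines are an added contrast between a range and an escaped hyphen.</p>

```python
import re

items = ["Word", "123", "-"]

# A leading hyphen is literal, so [-\d] means "a hyphen or a digit".
leading = [s for s in items if re.search(r"[-\d]", s)]
# An escaped hyphen is literal anywhere in the class.
escaped = [s for s in items if re.search(r"[\d\-]", s)]

# Between two characters an unescaped hyphen forms a range.
range_hit = re.search(r"[a-z]", "m") is not None     # m lies between a and z
literal_hit = re.search(r"[a\-z]", "m") is not None  # only a, - or z allowed

print(leading)                 # ['123', '-']
print(escaped)                 # ['123', '-']
print(range_hit, literal_hit)  # True False
```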
<p>Continued in: <a href="http://xaegr.wordpress.com/2009/11/26/regexp-2-negatives-and-anchors/">Negated Character Classes and Anchors</a>.</p>
</div>]]></content:encoded>
</item>
<item>
<title><![CDATA[GREP in InDesign tutorial released]]></title>
<link>http://macproductionartist.wordpress.com/2009/11/20/grep-in-indesign-tutorial-released/</link>
<pubDate>Fri, 20 Nov 2009 17:14:42 +0000</pubDate>
<dc:creator>paeon</dc:creator>
<guid>http://macproductionartist.wordpress.com/2009/11/20/grep-in-indesign-tutorial-released/</guid>
<description><![CDATA[Just what we&#8217;ve been waiting for! If you are shy about using regular expressions for searching]]></description>
<content:encoded><![CDATA[Just what we&#8217;ve been waiting for! If you are shy about using regular expressions for searching]]></content:encoded>
</item>
<item>
<title><![CDATA[Useful RegEx Pattern link]]></title>
<link>http://shaymol.wordpress.com/2009/11/18/useful-regex-pattern-link/</link>
<pubDate>Tue, 17 Nov 2009 18:58:51 +0000</pubDate>
<dc:creator>shaymol</dc:creator>
<guid>http://shaymol.wordpress.com/2009/11/18/useful-regex-pattern-link/</guid>
<description><![CDATA[Regular expression is very important in any programming . Sometimes we cannot think correctly for so]]></description>
<content:encoded><![CDATA[<div class='snap_preview'><p>Regular expressions are very important in any kind of programming. Sometimes we cannot come up with the right pattern for a common task. Here is a link where I found a very handy set of regular expressions.</p>
<p><a href="http://www.roscripts.com/PHP_regular_expressions_examples-136.html">Regular Expressions Patterns and Examples</a></p>
<p>I think it will be very helpful.</p>
<p>Cheer$</p>
</div>]]></content:encoded>
</item>
<item>
<title><![CDATA[Renaming files using Perl command rename]]></title>
<link>http://linuxindetails.wordpress.com/2009/11/16/renaming-files-using-perl-command-rename/</link>
<pubDate>Mon, 16 Nov 2009 12:19:02 +0000</pubDate>
<dc:creator>linuxindetails</dc:creator>
<guid>http://linuxindetails.wordpress.com/2009/11/16/renaming-files-using-perl-command-rename/</guid>
<description><![CDATA[I have been using this command for a while now and it helped very much.The command rename is availab]]></description>
<content:encoded><![CDATA[<div class='snap_preview'><p>I have been using this command for a while now and it has helped a great deal.<br />The <b>rename</b> command is provided by the <b>perl</b> package, a core package installed by the Debian installer.<br />To check that the command is available:<br /><b><br />fool@localhost:~$which rename<br />/usr/bin/rename</b></p>
<p>Having a closer look at the command, you will notice that it is a symbolic link to <b>/etc/alternatives/rename</b> :</p>
<p><b>fool@localhost:~$ls -l /usr/bin/rename<br />lrwxrwxrwx 1 root root 24 août&#160; 23&#160; 2008 /usr/bin/rename -&#62; /etc/alternatives/rename</b></p>
<p>To display the current configuration:</p>
<p><b>root@localhost:~#update-alternatives --display rename<br />rename - auto mode<br />&#160;link currently points to /usr/bin/prename<br />/usr/bin/prename - priority 60<br />&#160;slave rename.1.gz: /usr/share/man/man1/prename.1.gz<br />Current `best' version is /usr/bin/prename.</b></p>
<p>How do you use this command?<br />It is quite simple if you know regular expressions.<br />With the <b>-n</b> option you can test your regular expression without actually renaming any files.<br />Here is an example:</p>
<p><b>fool@localhost:~$touch fool_1.t fool_2.t</b></p>
<p>To rename fool_1.t and fool_2.t to fool1.t and fool2.t:<br /><b>fool@localhost:~$rename -n 's/_//g' *.t<br />fool_1.t renamed as fool1.t<br />fool_2.t renamed as fool2.t</b></p>
<p>To actually rename the files, remove the <b>-n</b> option:<br /><b>fool@localhost:~$rename 's/_//g' *.t</b></p>
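<p>The dry-run idea behind <b>-n</b> can also be sketched in Python for systems where the Perl <b>rename</b> script is not installed. This is an illustration only, not how <b>rename</b> itself is implemented: the function names <code>plan_renames</code> and <code>apply_renames</code> and the extra file <code>notes.txt</code> are assumptions made for the example.</p>

```python
import os
import re

def plan_renames(names, pattern, replacement):
    """Return (old, new) pairs for the names that the substitution changes."""
    plan = []
    for old in names:
        new = re.sub(pattern, replacement, old)
        if new != old:
            plan.append((old, new))
    return plan

def apply_renames(directory, plan):
    """Perform the planned renames inside the given directory."""
    for old, new in plan:
        os.rename(os.path.join(directory, old), os.path.join(directory, new))

# Dry run: only report what would happen, as with rename -n.
plan = plan_renames(["fool_1.t", "fool_2.t", "notes.txt"], r"_", "")
for old, new in plan:
    print(old, "would be renamed to", new)
```

<p>Calling <code>apply_renames(".", plan)</code> afterwards would carry out the renames in the current directory; keeping the plan separate makes it easy to review before anything is touched.</p>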
<p>More information :<br /><b><br />man rename </b><br />&#160;</p>
</div>]]></content:encoded>
</item>
<item>
<title><![CDATA[Regular Expressions in C#- Advanced Language Elements]]></title>
<link>http://spolnik.wordpress.com/2009/11/15/regular-expressions-in-c-advanced-language-elements/</link>
<pubDate>Sun, 15 Nov 2009 20:34:23 +0000</pubDate>
<dc:creator>Jacek Spólnik</dc:creator>
<guid>http://spolnik.wordpress.com/2009/11/15/regular-expressions-in-c-advanced-language-elements/</guid>
<description><![CDATA[Regular Expression Advanced Language Elements Grouping Constructs ( ) - captures the matched substri]]></description>
<content:encoded><![CDATA[<div class='snap_preview'><h2>Regular Expression Advanced Language Elements</h2>
<h3>Grouping Constructs</h3>
<ul>
<li><strong>(   )</strong> - captures the matched substring (or acts as a noncapturing group when the ExplicitCapture option is set). Captures using () are numbered automatically based on the order of the opening parenthesis, starting from one. Capture element number zero is the text matched by the whole regular expression pattern</li>
<li><strong>(?&#60;<em>name</em>&#62; )</strong> - captures the matched substring into a group identified by a name or by an explicit number. A name must not contain any punctuation and cannot begin with a digit. You can use single quotes instead of angle brackets</li>
<li><strong>(?&#60;<em>name1-name2</em>&#62; )</strong> - balancing group definition. Deletes the definition of the previously defined group name2 and stores in group name1 the interval between the previously defined name2 group and the current group. If no group name2 is defined, the match backtracks. Because deleting the last definition of name2 reveals the previous definition of name2, this construct allows the stack of captures for group name2 to be used as a counter for keeping track of nested constructs such as parentheses. In this construct, name1 is optional. You can use single quotes instead of angle brackets</li>
<li><strong>(?:   )</strong> &#8211; noncapturing group</li>
<li><strong>(?imnsx-imnsx:   )</strong> &#8211; applies or disables the specified options within the subexpression. For example, (?i-s: ) turns on case insensitivity and disables single-line mode</li>
<li><strong>(?=   )</strong> &#8211; zero-width positive lookahead assertion. Continues match only if the subexpression matches at this position on the right. For example, \w+(?=\d) matches a word followed by a digit, without matching the digit. This construct does not backtrack</li>
<li><strong>(?!   )</strong> &#8211; zero-width negative lookahead assertion. Continues match only if the subexpression does not match at this position on the right. For example, \b(?!un)\w+\b matches words that do not begin with un</li>
<li><strong>(?&#60;=   )</strong> &#8211; zero-width positive lookbehind assertion. Continues match only if the subexpression matches at this position on the left. For example, (?&#60;=19)99 matches instances of 99 that follow 19. This construct does not backtrack</li>
<li><strong>(?&#60;!   )</strong> &#8211; zero-width negative lookbehind assertion. Continues match only if the subexpression does not match at the position on the left</li>
<li><strong>(?&#62;   )</strong> &#8211; nonbacktracking subexpression (also known as a &#8220;greedy&#8221; subexpression). The subexpression is fully matched once, and then does not participate piecemeal in backtracking (That is, the subexpression matches only strings that would be matched by the subexpression alone.)</li>
</ul>
<p><em>Named captures are numbered sequentially, based on the left-to-right order of the opening parenthesis (like unnamed captures), but numbering of named captures starts after all unnamed captures have been counted.</em></p>
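<p>A few of the grouping constructs above can be tried out in a short sketch. It is written in Python rather than C#, on the assumption that the lookaround and noncapturing syntax shown here is shared by both engines; the sample strings are made up, apart from the patterns quoted from the list.</p>

```python
import re

# Positive lookahead: a run of word characters that is followed by a digit.
before_digit = re.search(r"\w+(?=\d)", "abc123").group()

# Negative lookahead: whole words that do not begin with "un".
not_un = re.findall(r"\b(?!un)\w+\b", "unhappy happy undone done")

# Positive lookbehind: a 99 that directly follows 19.
after_19 = [m.start() for m in re.finditer(r"(?<=19)99", "1999 2099")]

# Noncapturing group: groups the alternation without creating a capture.
m = re.search(r"(?:Mon|Tues)day (\d+)", "Tuesday 17")
captured = m.groups()

print(before_digit)  # abc12
print(not_un)        # ['happy', 'done']
print(after_19)      # [2]
print(captured)      # ('17',)
```

<p>The first result shows that the lookahead consumes nothing: the final digit is checked but left out of the match.</p>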
<h3>Backreference Constructs</h3>
<ul>
<li><strong>\number</strong> &#8211; backreference. For example, (\w)\1 finds doubled word characters</li>
<li><strong>\k&#60;<em>name</em>&#62;</strong> - named backreference. For example, (?&#60;char&#62;\w)\k&#60;char&#62; finds doubled word characters. The expression (?&#60;43&#62;\w)\43 does the same. You can use single quotes instead of angle brackets; for example, \k'char'</li>
</ul>
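<p>The doubled-character example can be checked with a numbered backreference, which is written the same way in Python. The named form differs between engines: Python spells it (?P&#60;char&#62;\w)(?P=char) instead of the .NET syntax above, so only the numbered form is used in this sketch.</p>

```python
import re

# (\w)\1: a word character followed by the same character again.
doubled = [m.group(0) for m in re.finditer(r"(\w)\1", "bookkeeper")]

# Capture 0 is the whole match; capture 1 is the parenthesised part.
first = re.search(r"(\w)\1", "bookkeeper")
whole, letter = first.group(0), first.group(1)

print(doubled)        # ['oo', 'kk', 'ee']
print(whole, letter)  # oo o
```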
<h3>Alternation Constructs</h3>
<ul>
<li><strong>&#124;</strong> &#8211; matches any one of the terms separated by the &#124; (vertical bar) character. The leftmost successful match wins</li>
<li><strong>(?(expression)yes&#124;no)</strong> &#8211; matches the &#8220;yes&#8221; part if the expression matches at this point; otherwise, matches the &#8220;no&#8221; part. The &#8220;no&#8221; part can be omitted. The expression can be any valid subexpression, but it is turned into a zero-width assertion, so this syntax is equivalent to (?(?=expression)yes&#124;no). Note that if the expression is the name of a named group or a capturing group number, the alternation construct is interpreted as a capture test (described in the next item of this list). To avoid confusion in these cases, you can spell out the inside (?=expression) explicitly</li>
<li><strong>(?(name)yes&#124;no)</strong> &#8211; matches the &#8220;yes&#8221; part if the named capture string has a match; otherwise, matches the &#8220;no&#8221; part. The &#8220;no&#8221; part can be omitted. If the given name does not correspond to the name or number of a capturing group used in this expression, the alternation construct is interpreted as an expression test (described in the preceding item of this list)</li>
</ul>
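<p>Below is a small sketch of alternation order and of the capture test, again in Python. Python supports only the capture-test form (a group number or name inside the inner parentheses), not the expression-test form described for .NET; the bracket-balancing pattern is an invented example.</p>

```python
import re

# Alternatives are tried left to right, so the earlier one wins.
first_wins = re.search(r"cat|category", "category").group()
longer_first = re.search(r"category|cat", "category").group()

# Capture test: require ")" at the end only if group 1 matched "(".
balanced = re.compile(r"^(\()?\w+(?(1)\))$")
results = {s: balanced.search(s) is not None
           for s in ["abc", "(abc)", "(abc", "abc)"]}

print(first_wins)    # cat
print(longer_first)  # category
print(results)       # {'abc': True, '(abc)': True, '(abc': False, 'abc)': False}
```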
<h3>Miscellaneous Constructs</h3>
<ul>
<li><strong>(?imnsx-imnsx)</strong> &#8211; turns options such as case insensitivity on or off in the middle of a pattern. Option changes are effective until the end of the enclosing group. See also the information on the grouping construct (?imnsx-imnsx: ), which is a cleaner form</li>
<li><strong>(?# )</strong> &#8211; inline comment inserted within a regular expression. The comment terminates at the first closing parenthesis character</li>
<li><strong># [to end of line]</strong> &#8211; X-mode comment. The comment begins at an unescaped # and continues to the end of the line. (Note that the x option or the RegexOptions.IgnorePatternWhitespace enumerated option must be activated for this kind of comment to be recognized.)</li>
</ul>
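<p>Inline options and comments look like this in practice. The sketch uses Python, where <code>re.VERBOSE</code> plays the role that the x option (RegexOptions.IgnorePatternWhitespace) plays in .NET; the date-like sample text is an assumption for illustration.</p>

```python
import re

# (?i) switches on case-insensitive matching for the whole pattern.
inline_flag = re.search(r"(?i)power", "PowerShell") is not None

# (?#...) is an inline comment and matches nothing.
with_comment = re.search(r"\d{4}(?#year)-\d{2}(?#month)", "2009-11").group()

# In verbose mode whitespace is ignored and # starts a comment.
date = re.compile(r"""
    (\d{4})   # year
    -         # separator
    (\d{2})   # month
""", re.VERBOSE)
year, month = date.search("posted 2009-11").groups()

print(inline_flag)   # True
print(with_comment)  # 2009-11
print(year, month)   # 2009 11
```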
<h2>Bibliography</h2>
<ul>
<li><a href="http://msdn.microsoft.com/en-us/library/az24scfc%28VS.71%29.aspx" target="_blank">MSDN</a></li>
</ul>
</div>]]></content:encoded>
</item>
<item>
<title><![CDATA[Regular Expressions in C#- Language Elements]]></title>
<link>http://spolnik.wordpress.com/2009/11/15/regular-expressions-in-c-language-elements/</link>
<pubDate>Sun, 15 Nov 2009 20:12:45 +0000</pubDate>
<dc:creator>Jacek Spólnik</dc:creator>
<guid>http://spolnik.wordpress.com/2009/11/15/regular-expressions-in-c-language-elements/</guid>
<description><![CDATA[Regular Expression Language Elements Meta Characters . &#8211; matches any single character $ ]]></description>
<content:encoded><![CDATA[<div class='snap_preview'><h2>Regular Expression Language Elements</h2>
<h3>Meta Characters</h3>
<ul>
<li><strong>.</strong> &#8211; matches any single character</li>
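<li><strong>$</strong> &#8211; matches the end of the input string, or the end of each line when the Multiline option is set</li>
<li><strong>^ </strong>- matches the beginning of the input string, or the beginning of each line when the Multiline option is set</li>
<li><strong>*</strong> &#8211; matches zero or more occurrences of the character or regular expression immediately preceding</li>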
<li><strong>\ </strong>- this is escape or quoting character. The character after this is treated as an ordinary character</li>
<li><strong>[] </strong> &#8211; matches any one of the characters between the brackets</li>
<li><strong>[a-z] </strong>- a range of characters can be specified by using a hyphen</li>
<li><strong>[^a-z] </strong>- matches any character except those in the range; a <strong>^</strong> as the first character inside the brackets negates the class</li>
<li><strong>() </strong>- treats the expression between ( and ) as a group and saves the text matched by it. Classic tools save up to nine such matches, referenced as \1 through \9; .NET does not limit the number of groups</li>
<li><strong>&#124; </strong>- alternation: matches either the expression on its left or the expression on its right</li>
<li><strong>+ </strong>- matches one or more occurrences of the character or regular expression immediately preceding</li>
<li><strong>? </strong>- matches 0 or 1 occurrence of the character or regular expression immediately preceding</li>
<li><strong>{n}</strong> &#8211; specifies exactly n matches</li>
<li><strong>{n,}</strong> &#8211; specifies at least n matches</li>
<li><strong>{n,m}</strong> &#8211; specifies at least n, but no more than m, matches</li>
<li><strong>*? </strong>- specifies the first match that consumes as few repeats as possible (equivalent to lazy *)</li>
<li><strong>+? </strong>- specifies as few repeats as possible, but at least one (equivalent to lazy +)</li>
<li><strong>?? </strong>- specifies zero repeats if possible, or one (lazy ?)</li>
<li><strong>{n}?</strong> &#8211; equivalent to {n} (lazy {n})</li>
<li><strong>{n,}?</strong> &#8211; specifies as few repeats as possible, but at least n (lazy {n,})</li>
<li><strong>{n,m}?</strong> &#8211; specifies as few repeats as possible between n and m (lazy {n,m})</li>
</ul>
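<p>The difference between greedy and lazy quantifiers is easiest to see on the same input. A minimal Python sketch follows (the quantifier syntax is the same as in .NET; the sample text is made up):</p>

```python
import re

text = 'say "one" and "two"'

greedy = re.search(r'".+"', text).group()  # runs to the last quote
lazy = re.search(r'".+?"', text).group()   # stops at the first closing quote
all_lazy = re.findall(r'".+?"', text)

# Counted repeats follow the same rule.
greedy_count = re.search(r"\d{2,4}", "12345").group()
lazy_count = re.search(r"\d{2,4}?", "12345").group()

print(greedy)                    # "one" and "two"
print(lazy)                      # "one"
print(all_lazy)                  # ['"one"', '"two"']
print(greedy_count, lazy_count)  # 1234 12
```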
<h3>Character Escapes</h3>
<ul>
<li><strong>\a </strong>-  matches a bell (alarm) \u0007</li>
<li><strong>\b </strong>- matches a backspace \u0008 if in a [] character class; otherwise,  <strong>\b</strong> denotes a word boundary (between <strong>\w</strong> and <strong>\W</strong> characters) . In a replacement pattern, <strong>\b</strong> always denotes a backspace</li>
<li><strong>\t </strong>- matches a tab \u0009</li>
<li><strong>\r</strong> &#8211; matches a carriage return \u000D</li>
<li><strong>\v </strong>- matches a vertical tab \u000B</li>
<li><strong>\f</strong> &#8211; matches a form feed \u000C</li>
<li><strong>\n</strong> &#8211; matches a new line \u000A</li>
<li><strong>\e</strong> &#8211; matches an escape \u001B</li>
<li><strong>\040</strong> &#8211; matches an ASCII character as octal (up to three digits); numbers with no leading zero are backreferences if they have only one digit or if they correspond to a capturing group number</li>
<li><strong>\x20</strong> &#8211; matches an ASCII character using hexadecimal representation (exactly two digits)</li>
<li><strong>\cC</strong> &#8211; matches an ASCII control character</li>
<li><strong>\u0020</strong> &#8211; matches a Unicode character using hexadecimal representation (exactly four digits)</li>
</ul>
<h3>Substitutions</h3>
<ul>
<li><strong>$number</strong> &#8211; substitutes the last substring matched by the group with the given (decimal) number</li>
<li><strong>${name}</strong> &#8211; substitutes the last substring matched by a (?&#60;name&#62; ) group</li>
<li><strong>$$</strong> &#8211; substitutes a single &#8220;$&#8221; literal</li>
<li><strong>$&#38;</strong> &#8211; substitutes a copy of the entire match itself</li>
<li><strong>$`</strong> &#8211; substitutes all the text of the input string before the match</li>
<li><strong>$&#8217;</strong> &#8211; substitutes all the text of the input string after the match</li>
<li><strong>$+ </strong>- substitutes the last group captured</li>
<li><strong>$_ </strong>- substitutes the entire input string</li>
</ul>
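<p>The substitution syntax above is specific to .NET (for example in <code>Regex.Replace</code>). For comparison only, here is a hedged sketch of the closest Python equivalents: Python writes group references as <code>\1</code> rather than <code>$1</code>, has no direct counterpart of <code>$&#38;</code>, and treats a dollar sign in the replacement as an ordinary character.</p>

```python
import re

# The .NET replacement "$2 $1" corresponds to r"\2 \1" in Python.
swapped = re.sub(r"(\w+) (\w+)", r"\2 \1", "hello world")

# The whole-match substitution can be emulated with a function.
bracketed = re.sub(r"\d+", lambda m: "[" + m.group(0) + "]", "a1b22")

# A literal dollar sign needs no doubling in a Python replacement string.
priced = re.sub(r"(\d+)", r"$\1", "cost 5")

print(swapped)    # world hello
print(bracketed)  # a[1]b[22]
print(priced)     # cost $5
```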
<h3>Character Classes</h3>
<ul>
<li><strong>\p{name}</strong> &#8211; matches any character in the named character class specified by {name}. Supported names are Unicode groups and block ranges</li>
<li><strong>\P{name}</strong> &#8211; matches text not included in groups and block ranges specified in {name}</li>
<li><strong>\w </strong>- matches any word character</li>
<li><strong>\W </strong>- matches any nonword character</li>
<li><strong>\s </strong>- matches any white-space character</li>
<li><strong>\S </strong>- matches any non-white-space character</li>
<li><strong>\d </strong>- matches any decimal digit</li>
<li><strong>\D </strong>- matches any nondigit</li>
</ul>
<h3>Atomic Zero-Width Assertions</h3>
<ul>
<li><strong>\A</strong> &#8211; specifies that the match must occur at the beginning of the string (ignores the Multiline option)</li>
<li><strong>\Z </strong>- specifies that the match must occur at the end of the string or before \n at the end of the string (ignores the Multiline option)</li>
<li><strong>\z </strong>- specifies that the match must occur at the end of the string (ignores the Multiline option)</li>
<li><strong>\G</strong> &#8211; specifies that the match must occur at the point where the previous match ended. When used with Match.NextMatch(), this ensures that matches are all contiguous</li>
<li><strong>\b </strong>- specifies that the match must occur on a boundary between \w (alphanumeric) and \W (nonalphanumeric) characters. The match must occur on word boundaries &#8211; that is, at the first or last characters in words separated by any nonalphanumeric characters</li>
<li><strong>\B </strong>- specifies that the match must not occur on a \b boundary</li>
</ul>
<h2>Bibliography</h2>
<ul>
<li><a class="wp-caption" href="http://msdn.microsoft.com/en-us/library/az24scfc%28VS.71%29.aspx" target="_blank">MSDN</a></li>
</ul>
</div>]]></content:encoded>
</item>
<item>
<title><![CDATA[Regular Expressions in C# - Intro]]></title>
<link>http://spolnik.wordpress.com/2009/11/15/regular-expressions-in-c-intro/</link>
<pubDate>Sun, 15 Nov 2009 12:06:29 +0000</pubDate>
<dc:creator>Jacek Spólnik</dc:creator>
<guid>http://spolnik.wordpress.com/2009/11/15/regular-expressions-in-c-intro/</guid>
<description><![CDATA[Regular Expressions in C# Everyone developer who works with a text should know the regular expressio]]></description>
<content:encoded><![CDATA[<div class='snap_preview'><h2>Regular Expressions in C#</h2>
<p>Every developer who works with text should know regular expressions (regex). Regex is an essential tool for text processing: with a properly written expression we can find any phrase we want.</p>
<p>The .NET classes that support regular expressions in C#:</p>
<ul>
<li><strong>Regex Class</strong> &#8211; the <a id="ctl00_MTCS_main_ctl01" href="http://msdn.microsoft.com/en-us/library/system.text.regularexpressions.regex%28VS.71%29.aspx">Regex</a> class represents an immutable (read-only) regular expression. It also contains static methods that allow use of other regular expression classes without explicitly creating instances of the other classes.</li>
<li><strong>Match Class</strong> &#8211; the <a id="ctl00_MTCS_main_ctl03" href="http://msdn.microsoft.com/en-us/library/system.text.regularexpressions.match%28VS.71%29.aspx">Match</a> class represents the results of a regular expression matching operation. The <strong>Match</strong> method of the <strong>Regex</strong> class returns an object of type <strong>Match</strong> that describes the first match in the input string.</li>
<li><strong>MatchCollection</strong> Class &#8211; the <a id="ctl00_MTCS_main_ctl06" href="http://msdn.microsoft.com/en-us/library/system.text.regularexpressions.matchcollection%28VS.71%29.aspx">MatchCollection</a> class represents a sequence of successful non-overlapping matches. The collection is immutable (read-only) and has no public constructor. Instances of <strong>MatchCollection</strong> are returned by the <strong>Regex.Matches</strong> method.</li>
<li><strong>GroupCollection</strong> Class &#8211; the <a id="ctl00_MTCS_main_ctl09" href="http://msdn.microsoft.com/en-us/library/system.text.regularexpressions.groupcollection%28VS.71%29.aspx">GroupCollection</a> class represents a collection of captured groups and returns the set of captured groups in a single match. The collection is immutable (read-only) and has no public constructor. Instances of <strong>GroupCollection</strong> are returned in the collection that the <strong>Match.Groups</strong> property returns.</li>
<li><strong>CaptureCollection</strong> Class &#8211; the <a id="ctl00_MTCS_main_ctl12" href="http://msdn.microsoft.com/en-us/library/system.text.regularexpressions.capturecollection%28VS.71%29.aspx">CaptureCollection</a> class represents a sequence of captured substrings and returns the set of captures done by a single capturing group. A capturing group can capture more than one string in a single match because of <a id="ctl00_MTCS_main_ctl13" href="http://msdn.microsoft.com/en-us/library/3206d374%28VS.71%29.aspx">quantifiers</a>. The <strong>Captures</strong> property, an object of the <strong>CaptureCollection </strong>class, is provided as a member of the <a id="ctl00_MTCS_main_ctl14" href="http://msdn.microsoft.com/en-us/library/system.text.regularexpressions.match%28VS.71%29.aspx">Match</a> and <a id="ctl00_MTCS_main_ctl15" href="http://msdn.microsoft.com/en-us/library/system.text.regularexpressions.group%28VS.71%29.aspx">Group</a> classes to facilitate access to the set of captured substrings.</li>
<li><strong>Group</strong> Class &#8211; the <a id="ctl00_MTCS_main_ctl18" href="http://msdn.microsoft.com/en-us/library/system.text.regularexpressions.group%28VS.71%29.aspx">Group</a> class represents the results from a single capturing group. Because<strong> Group</strong> can capture zero, one, or more strings in a single match (using quantifiers), it contains a collection of <strong>Capture</strong> objects. Because <strong>Group</strong> inherits from <strong>Capture</strong>, the last substring captured can be accessed directly (the <strong>Group</strong> instance itself is equivalent to the last item of the collection returned by the <strong>Captures</strong> property).</li>
<li><strong>Capture </strong>Class &#8211; the <a id="ctl00_MTCS_main_ctl23" href="http://msdn.microsoft.com/en-us/library/system.text.regularexpressions.capture%28VS.71%29.aspx">Capture</a> class contains the results from a single subexpression capture.</li>
</ul>
<h3>Bibliography</h3>
<ul>
<li><a class="wp-caption" href="http://msdn.microsoft.com/en-us/library/30wbz966%28VS.71%29.aspx" target="_blank">MSDN</a></li>
</ul>
</div>]]></content:encoded>
</item>
<item>
<title><![CDATA[Calc vs. Excel Find and Replace with Regular Expressions]]></title>
<link>http://macproductionartist.wordpress.com/2009/11/13/cool-use-of-calc-vs-excel/</link>
<pubDate>Sat, 14 Nov 2009 01:21:59 +0000</pubDate>
<dc:creator>paeon</dc:creator>
<guid>http://macproductionartist.wordpress.com/2009/11/13/cool-use-of-calc-vs-excel/</guid>
<description><![CDATA[Calc is the open source version of Excel. It is part of the Open Office suite of applications you ca]]></description>
<content:encoded><![CDATA[Calc is the open source version of Excel. It is part of the Open Office suite of applications you ca]]></content:encoded>
</item>
<item>
<title><![CDATA[Regular Expressions Cookbook]]></title>
<link>http://referencedesigner.wordpress.com/2009/11/10/regular-expressions-cookbook/</link>
<pubDate>Tue, 10 Nov 2009 03:29:21 +0000</pubDate>
<dc:creator>abhinavi2</dc:creator>
<guid>http://referencedesigner.wordpress.com/2009/11/10/regular-expressions-cookbook/</guid>
<description><![CDATA[I have in my hand the book “Regular Expressions Cookbook”, by Jan Goyvaerts and Steven Levithan. Ver]]></description>
<content:encoded><![CDATA[<div class='snap_preview'><p>I have in my hand the book “Regular Expressions Cookbook” by Jan Goyvaerts and Steven Levithan. It is very well written, and I would recommend adding it to your library if it is not already there.<br />
<a href="http://referencedesigner.com/blog/regular-expressions-cookbook/328/"> http://referencedesigner.com/blog/regular-expressions-cookbook/328/</a></p>
</div>]]></content:encoded>
</item>
<item>
<title><![CDATA[hello php folks]]></title>
<link>http://beverlywebcontent.wordpress.com/2009/11/04/hello-php-folks/</link>
<pubDate>Wed, 04 Nov 2009 19:14:27 +0000</pubDate>
<dc:creator>Beverly</dc:creator>
<guid>http://beverlywebcontent.wordpress.com/2009/11/04/hello-php-folks/</guid>
<description><![CDATA[If you&#8217;re frustrated with the php manual, check out this link. Digg it if you like it, (I did,]]></description>
<content:encoded><![CDATA[If you&#8217;re frustrated with the php manual, check out this link. Digg it if you like it, (I did,]]></content:encoded>
</item>
<item>
<title><![CDATA[Zip Code Regular Expression]]></title>
<link>http://vijayvepa.wordpress.com/2009/11/03/zip-code-regular-expression/</link>
<pubDate>Tue, 03 Nov 2009 19:26:00 +0000</pubDate>
<dc:creator>vijayvepa</dc:creator>
<guid>http://vijayvepa.wordpress.com/2009/11/03/zip-code-regular-expression/</guid>
<description><![CDATA[Allows zip5 and zip5-zip4 formats ^(?&lt;zip5&gt;\d{5})$|(^(?&lt;zip5&gt;\d{5})-(?&lt;zip4&gt;\d{4})]]></description>
<content:encoded><![CDATA[<div class='snap_preview'><p>Allows zip5 and zip5-zip4 formats</p>
<blockquote><p><strong>^(?&#60;zip5&#62;\d{5})$&#124;(^(?&#60;zip5&#62;\d{5})-(?&#60;zip4&#62;\d{4})$)</strong></p></blockquote>
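<p>As a rough illustration, the same check can be sketched in Python. Python's <code>re</code> module does not allow the same group name twice, so this hypothetical rewrite states <code>zip5</code> once and makes the <code>-zip4</code> part optional; <code>ZIP_RE</code> and <code>parse_zip</code> are names invented for the example.</p>

```python
import re

# Hypothetical Python rewrite of the pattern above: the shared zip5 group is
# written once and the "-zip4" suffix is optional.
ZIP_RE = re.compile(r"^(?P<zip5>\d{5})(?:-(?P<zip4>\d{4}))?$")

def parse_zip(text):
    """Return (zip5, zip4) for a valid code, or None; zip4 is None when absent."""
    m = ZIP_RE.match(text)
    if m is None:
        return None
    return m.group("zip5"), m.group("zip4")
```

<p>For example, <code>parse_zip("12345-6789")</code> returns both parts, while a malformed code such as <code>"12345-678"</code> returns <code>None</code>.</p>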
</div>]]></content:encoded>
</item>
<item>
<title><![CDATA[Phone Number Regular Expression]]></title>
<link>http://vijayvepa.wordpress.com/2009/11/03/phone-number-regular-expression/</link>
<pubDate>Tue, 03 Nov 2009 18:56:19 +0000</pubDate>
<dc:creator>vijayvepa</dc:creator>
<guid>http://vijayvepa.wordpress.com/2009/11/03/phone-number-regular-expression/</guid>
<description><![CDATA[This one does not allow (755-433-4444 but accepts 757-333-4444 7774443333 444 444 4444 444 400-9444 ]]></description>
<content:encoded><![CDATA[<div class='snap_preview'><p>This one rejects (755-433-4444 (unbalanced parenthesis) but accepts:</p>
<p>757-333-4444</p>
<p>7774443333</p>
<p>444 444 4444</p>
<p>444 400-9444</p>
<blockquote><p><strong>^(((?&#60;area&#62;\d{3})-?)&#124;\((?&#60;area&#62;\d{3})\))\s*(?&#60;ph1&#62;\d{3})-?\s*(?&#60;ph2&#62;\d{4})$</strong></p></blockquote>
<p>The replacement string to use is ${area}${ph1}${ph2}, which outputs only the ten digits, for example 4443333333. Add spaces and other characters as necessary.</p>
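<p>A minimal Python sketch of the same idea follows. Because Python's <code>re</code> module rejects duplicate group names, the two <code>area</code> alternatives are given the hypothetical names <code>area_plain</code> and <code>area_paren</code>; <code>normalize_phone</code> is likewise an invented helper that joins the captured parts the way the replacement string does.</p>

```python
import re

# Python variant of the pattern above; the two area-code alternatives get
# distinct (hypothetical) names because Python forbids duplicate group names.
PHONE_RE = re.compile(
    r"^(?:(?P<area_plain>\d{3})-?|\((?P<area_paren>\d{3})\))"
    r"\s*(?P<ph1>\d{3})-?\s*(?P<ph2>\d{4})$"
)

def normalize_phone(text):
    """Return the ten digits of a valid number, or None if it does not match."""
    m = PHONE_RE.match(text)
    if m is None:
        return None
    # Exactly one of the two area groups participates in a successful match.
    area = m.group("area_plain") or m.group("area_paren")
    return area + m.group("ph1") + m.group("ph2")
```

<p>With this sketch, <code>normalize_phone("444 400-9444")</code> gives <code>"4444009444"</code>, and the unbalanced <code>"(755-433-4444"</code> gives <code>None</code>.</p>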
</div>]]></content:encoded>
</item>
<item>
<title><![CDATA[Regular Expressions: Match all alphanumeric characters]]></title>
<link>http://notetomys11.wordpress.com/2009/11/03/regular-expressions-match-all-alphanumeric-characters/</link>
<pubDate>Tue, 03 Nov 2009 15:21:22 +0000</pubDate>
<dc:creator>notetomys11</dc:creator>
<guid>http://notetomys11.wordpress.com/2009/11/03/regular-expressions-match-all-alphanumeric-characters/</guid>
<description><![CDATA[[\p{L}\p{N}] Where \p{L} matches all letters \p{N} matches all numbers More information]]></description>
<content:encoded><![CDATA[<div class='snap_preview'><p><code>[\p{L}\p{N}]</code></p>
<p>Where<br />
<code>\p{L}</code> matches all letters<br />
<code>\p{N}</code> matches all numbers</p>
<p><a href="http://www.regular-expressions.info/unicode.html" target="_blank">More information</a></p>
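<p>Not every engine supports <code>\p{...}</code>; Python's built-in <code>re</code> module, for one, does not. As a hedged sketch of the same test without a regex, the Unicode general category can be checked directly. The helper names below are hypothetical.</p>

```python
import unicodedata

def is_letter_or_number(ch):
    # General categories beginning with "L" are letters, "N" are numbers,
    # which corresponds to \p{L} and \p{N}.
    return unicodedata.category(ch)[0] in ("L", "N")

def keep_alphanumeric(text):
    # Drop every character that is neither a letter nor a number.
    return "".join(ch for ch in text if is_letter_or_number(ch))
```

<p>Unlike <code>\w</code>, this check does not accept the underscore, since its category is connector punctuation.</p>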
</div>]]></content:encoded>
</item>
<item>
<title><![CDATA[Genius @ work]]></title>
<link>http://blog.foppiano.org/2009/11/02/genius-work/</link>
<pubDate>Mon, 02 Nov 2009 21:45:20 +0000</pubDate>
<dc:creator>whitenoise</dc:creator>
<guid>http://blog.foppiano.org/2009/11/02/genius-work/</guid>
<description><![CDATA[Archimedes said: &#8220;Give me a lever long enough and a fulcrum on which to place it, and I shall m]]></description>
<content:encoded><![CDATA[<div class='snap_preview'><p>Archimedes said: <em>&#8220;Give me a lever long enough and a fulcrum on which to place it, and I shall move the world.&#8221;</em></p>
<p>Luca said: <em>&#8220;Give me a regular expression long enough and a string on which to apply it, and I shall fix this code.&#8221;</em></p>
</div>]]></content:encoded>
</item>
<item>
<title><![CDATA[[Programming Resources]--Java, C/C+, HTML, Perl, Python...]]></title>
<link>http://skyhan.wordpress.com/2009/10/31/programming-resources-java-cc-html-perl-python/</link>
<pubDate>Sun, 01 Nov 2009 02:46:22 +0000</pubDate>
<dc:creator>Skyhan</dc:creator>
<guid>http://skyhan.wordpress.com/2009/10/31/programming-resources-java-cc-html-perl-python/</guid>
<description><![CDATA[I spent a lot time to get these resources, enjoy: Actionscript Quick reference/Cheatsheet for Action]]></description>
<content:encoded><![CDATA[I spent a lot time to get these resources, enjoy: Actionscript Quick reference/Cheatsheet for Action]]></content:encoded>
</item>
<item>
<title><![CDATA[[Programming Resources]--Java, C/C+, HTML, Perl, Python...]]></title>
<link>http://hanjinda.wordpress.com/2009/10/31/programming-resources-java-cc-html-perl-python/</link>
<pubDate>Sun, 01 Nov 2009 02:46:22 +0000</pubDate>
<dc:creator>Skyhan</dc:creator>
<guid>http://hanjinda.wordpress.com/2009/10/31/programming-resources-java-cc-html-perl-python/</guid>
<description><![CDATA[I spent a lot time to get these resources, enjoy: Actionscript Quick reference/Cheatsheet for Action]]></description>
<content:encoded><![CDATA[I spent a lot time to get these resources, enjoy: Actionscript Quick reference/Cheatsheet for Action]]></content:encoded>
</item>
<item>
<title><![CDATA[Extracting links from a web page]]></title>
<link>http://xaegr.wordpress.com/2009/10/31/extract-webpage-links/</link>
<pubDate>Sat, 31 Oct 2009 14:09:02 +0000</pubDate>
<dc:creator>Xaegr</dc:creator>
<guid>http://xaegr.wordpress.com/2009/10/31/extract-webpage-links/</guid>
<description><![CDATA[For most uses of regular expressions in PowerShell, the operators -match]]></description>
<content:encoded><![CDATA[<div class='snap_preview'><p>For most uses of regular expressions in PowerShell, the -match and -replace operators and the Select-String cmdlet are enough. Sometimes, though, their capabilities fall short, and then the [regex] class comes to the rescue, bringing the full power of .NET regular expressions <img src='http://s.wordpress.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  One of its simplest uses is pulling several occurrences of one expression out of a piece of text. As an example, let's extract the links from the source of a web page. Note up front that the expression used to recognize a URL is not <a href="http://www.google.ru/search?hl=ru&#38;q=url+regular+expression+pattern">exact</a>, but in most cases it will be more than sufficient.<br />
So, to begin, let's declare a function that downloads the source of a web page (this is an updated version of <a href="http://xaegr.wordpress.com/2007/01/03/get-wwwstring-get-translation/">Get-WWWString</a>):</p>
<pre class='PowerShellColorizedScript'><span style='color:#00008b;'>function</span> <span style='color:#8a2be2;'>Get-WwwString</span> <span style='color:#000000;'>(</span><span style='color:#008080;'>[string]</span><span style='color:#ff4500;'>$Url</span><span style='color:#a9a9a9;'>,</span> <span style='color:#008080;'>[string]</span><span style='color:#ff4500;'>$Encoding</span><span style='color:#a9a9a9;'>=</span><span style='color:#8b0000;'>"windows-1251"</span><span style='color:#a9a9a9;'>,</span> <span style='color:#008080;'>[System.Management.Automation.PSCredential]</span><span style='color:#ff4500;'>$ProxyCredential</span> <span style='color:#a9a9a9;'>=</span> <span style='color:#ff4500;'>$GlobalCreds</span><span style='color:#000000;'>)</span>
<span style='color:#000000;'>{</span>
        <span style='color:#ff4500;'>$wc</span> <span style='color:#a9a9a9;'>=</span> <span style='color:#0000ff;'>new-object</span> <span style='color:#8a2be2;'>System.Net.WebClient</span>
        <span style='color:#ff4500;'>$wc</span><span style='color:#a9a9a9;'>.</span><span style='color:#000000;'>Encoding</span> <span style='color:#a9a9a9;'>=</span> <span style='color:#008080;'>[System.Text.Encoding]</span><span style='color:#a9a9a9;'>::</span><span style='color:#000000;'>GetEncoding</span><span style='color:#000000;'>(</span><span style='color:#ff4500;'>$Encoding</span><span style='color:#000000;'>)</span>
        <span style='color:#ff4500;'>$wc</span><span style='color:#a9a9a9;'>.</span><span style='color:#000000;'>UseDefaultCredentials</span> <span style='color:#a9a9a9;'>=</span> <span style='color:#ff4500;'>$true</span>
        <span style='color:#00008b;'>if</span> <span style='color:#000000;'>(</span><span style='color:#ff4500;'>$ProxyCredential</span><span style='color:#000000;'>)</span> <span style='color:#000000;'>{</span><span style='color:#ff4500;'>$wc</span><span style='color:#a9a9a9;'>.</span><span style='color:#000000;'>Proxy</span><span style='color:#a9a9a9;'>.</span><span style='color:#000000;'>Credentials</span> <span style='color:#a9a9a9;'>=</span> <span style='color:#ff4500;'>$ProxyCredential</span><span style='color:#a9a9a9;'>.</span><span style='color:#000000;'>GetNetworkCredential</span><span style='color:#000000;'>(</span><span style='color:#000000;'>)</span><span style='color:#000000;'>}</span>
        <span style='color:#ff4500;'>$wc</span><span style='color:#a9a9a9;'>.</span><span style='color:#000000;'>DownloadString</span><span style='color:#000000;'>(</span><span style='color:#ff4500;'>$url</span><span style='color:#000000;'>)</span>
<span style='color:#000000;'>}</span></pre>
<p>Now we load the page and pull out the links&#8230;</p>
<pre class='PowerShellColorizedScript'><span style='color:#ff4500;'>$Text</span> <span style='color:#a9a9a9;'>=</span> <span style='color:#0000ff;'>Get-WwwString</span> <span style='color:#8b0000;'>"http://ya.ru"</span>            

<span style='color:#008080;'>[regex]</span><span style='color:#ff4500;'>$reg</span> <span style='color:#a9a9a9;'>=</span> <span style='color:#8b0000;'>'"(\w+://[^"]+)"'</span>
<span style='color:#ff4500;'>$match</span> <span style='color:#a9a9a9;'>=</span> <span style='color:#ff4500;'>$reg</span><span style='color:#a9a9a9;'>.</span><span style='color:#000000;'>match</span><span style='color:#000000;'>(</span><span style='color:#ff4500;'>$Text</span><span style='color:#000000;'>)</span>
<span style='color:#00008b;'>while</span> <span style='color:#000000;'>(</span><span style='color:#ff4500;'>$match</span><span style='color:#a9a9a9;'>.</span><span style='color:#000000;'>Success</span><span style='color:#000000;'>)</span>
<span style='color:#000000;'>{</span>
    <span style='color:#ff4500;'>$match</span><span style='color:#a9a9a9;'>.</span><span style='color:#000000;'>Groups</span><span style='color:#a9a9a9;'>[</span><span style='color:#800080;'>1</span><span style='color:#a9a9a9;'>]</span><span style='color:#a9a9a9;'>.</span><span style='color:#000000;'>value</span>
    <span style='color:#ff4500;'>$match</span> <span style='color:#a9a9a9;'>=</span> <span style='color:#ff4500;'>$match</span><span style='color:#a9a9a9;'>.</span><span style='color:#000000;'>nextMatch</span><span style='color:#000000;'>(</span><span style='color:#000000;'>)</span>
<span style='color:#000000;'>}</span></pre>
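<p>For comparison, here is a minimal Python sketch of the same extraction loop. It reuses the pattern from the PowerShell snippet and runs on a small hard-coded sample string instead of a downloaded page; <code>extract_links</code> and <code>sample</code> are hypothetical names introduced for the example.</p>

```python
import re

# Same pattern as in the PowerShell snippet: a quoted string that starts
# with a scheme such as http:// and runs up to the closing quote.
LINK_RE = re.compile(r'"(\w+://[^"]+)"')

def extract_links(html):
    # finditer plays the role of the Match()/NextMatch() loop.
    return [m.group(1) for m in LINK_RE.finditer(html)]

sample = (
    '<a href="http://ya.ru/">Ya</a> '
    '<img src="https://example.com/i.png"> '
    '<a href="/local">x</a>'
)
print(extract_links(sample))
```

<p>Relative links such as <code>/local</code> are skipped because the pattern requires a scheme followed by <code>://</code>.</p>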
</div>]]></content:encoded>
</item>
<item>
<title><![CDATA[Regular expressions - preg_match_all()]]></title>
<link>http://web4us.wordpress.com/2009/10/21/regular-expressions-preg_match_all/</link>
<pubDate>Wed, 21 Oct 2009 12:23:33 +0000</pubDate>
<dc:creator>webforus</dc:creator>
<guid>http://web4us.wordpress.com/2009/10/21/regular-expressions-preg_match_all/</guid>
<description><![CDATA[Preg_Match_All is used to search a string for specific patterns and stores the results in an array. ]]></description>
<content:encoded><![CDATA[<div class='snap_preview'><p><em>preg_match_all</em> searches a string for a pattern and stores the results in an array.  Unlike <em>preg_match</em>, which stops searching after the first match, <em>preg_match_all</em> searches the entire string and records all matches.  Its call has the form: <strong>preg_match_all (pattern, string, $array, optional_ordering, optional_offset)</strong></p>
<p><!--more--></p>
<pre>&#60;?php
 $data = "The party will start at 10:30 pm and run until 12:30 am";
 <strong>preg_match_all</strong>('/(\d+:\d+)\s*(am&#124;pm)/', $data, $match, <em>PREG_PATTERN_ORDER</em>);
 print_r($match);
 ?&#62;
<strong>Output</strong>
Array
(
    [0] =&#62; Array
        (
            [0] =&#62; 10:30 pm
            [1] =&#62; 12:30 am
        )

    [1] =&#62; Array
        (
            [0] =&#62; 10:30
            [1] =&#62; 12:30
        )

    [2] =&#62; Array
        (
            [0] =&#62; pm
            [1] =&#62; am
        )

)
</pre>
<p>In our first example we use PREG_PATTERN_ORDER. We are searching for two things: the time and its am/pm tag. The results are written to $match as an array in which $match[0] contains all full matches, $match[1] contains everything matched by the first sub-pattern (the time), and $match[2] contains everything matched by the second sub-pattern (am/pm).</p>
<pre>&#60;?php
 $data = "The party will start at 10:30 pm and run until 12:30 am";
 <strong>preg_match_all</strong>('/(\d+:\d+)\s*(am&#124;pm)/', $data, $match, <em>PREG_SET_ORDER</em>);
print_r($match);
 ?&#62;
<strong>Output</strong>
Array
(
    [0] =&#62; Array
        (
            [0] =&#62; 10:30 pm
            [1] =&#62; 10:30
            [2] =&#62; pm
        )

    [1] =&#62; Array
        (
            [0] =&#62; 12:30 am
            [1] =&#62; 12:30
            [2] =&#62; am
        )

)

</pre>
<p>In our second example we use PREG_SET_ORDER. This puts each full result into an array. The first result is $match[0], with $match[0][0] being the full match, $match[0][1] being the first sub-match and $match[0][2] being the second sub-match. </p>
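<p>The difference between the two orderings can be sketched in Python, which has no direct equivalent of these flags. In this hypothetical example, <code>set_order</code> builds one row per match and <code>pattern_order</code> transposes those rows into one row per group.</p>

```python
import re

TIME_RE = re.compile(r"(\d+:\d+)\s*(am|pm)")

def set_order(text):
    # One entry per match: [full match, group 1, group 2].
    return [[m.group(0), m.group(1), m.group(2)] for m in TIME_RE.finditer(text)]

def pattern_order(text):
    # Transpose the per-match rows: one entry per group, across all matches.
    return [list(column) for column in zip(*set_order(text))]

data = "The party will start at 10:30 pm and run until 12:30 am"
print(set_order(data))
print(pattern_order(data))
```

<p>One difference to keep in mind: when nothing matches, this sketch returns an empty list for both orderings.</p>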
</div>]]></content:encoded>
</item>
<item>
<title><![CDATA[Regular expressions - preg_grep()]]></title>
<link>http://web4us.wordpress.com/2009/10/21/regular-expressions-preg_grep/</link>
<pubDate>Wed, 21 Oct 2009 11:45:07 +0000</pubDate>
<dc:creator>webforus</dc:creator>
<guid>http://web4us.wordpress.com/2009/10/21/regular-expressions-preg_grep/</guid>
<description><![CDATA[The PHP function, preg_grep, is used to search an array for specific patterns and then return a new ]]></description>
<content:encoded><![CDATA[<div class='snap_preview'><p>The PHP function, <em>preg_grep</em>, is used to search an array for specific patterns and then return a new array based on that filtering. There are two ways to return the results. You can return them as is, or you can invert them (instead of only returning what matches, it would only return what does not match.) It is phrased as: <strong>preg_grep ( search_pattern, $your_array, optional_inverse )</strong> The search_pattern needs to be a regular expression.  If you are unfamiliar with them this article gives you an overview of the syntax.</p>
<pre>&#60;?php
$data = array(0, 1, 2, 'three', 4, 5, 'six', 7, 8, 'nine', 10);
$mod1 = <strong>preg_grep</strong>("/4&#124;5&#124;6/", $data);
$mod2 = <strong>preg_grep</strong>("/[0-9]/", $data, PREG_GREP_INVERT);
print_r($mod1);
echo "&#60;br&#62;";
print_r($mod2);
?&#62;
</pre>
<p>This code would result in the following data:<br />
Array ( [4] =&#62; 4 [5] =&#62; 5 )<br />
Array ( [3] =&#62; three [6] =&#62; six [9] =&#62; nine )</p>
<p>First we assign our $data variable. This is a list of numbers, some in alpha form, others in numeric. The first thing we run is called $mod1. Here we are searching for anything that contains 4, 5, or 6. When our result is printed below we only get 4 and 5, because 6 was written as &#8217;six&#8217; so it did not match our search.</p>
<p>Next we run $mod2, which searches for anything that contains a numeric character. This time we include PREG_GREP_INVERT, which inverts the result: instead of returning the numbers, it returns all of the entries that were not numeric (three, six and nine).</p>
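<p>The same filtering can be sketched in Python. The hypothetical <code>grep</code> helper below keeps the original indices as dictionary keys, mirroring how the PHP result keeps the original array keys, and converts each item to a string before matching.</p>

```python
import re

def grep(pattern, items, invert=False):
    """Return {index: item} for items whose text matches (or, inverted, does not)."""
    rx = re.compile(pattern)
    return {
        i: item
        for i, item in enumerate(items)
        if (rx.search(str(item)) is not None) != invert
    }

data = [0, 1, 2, "three", 4, 5, "six", 7, 8, "nine", 10]
print(grep("4|5|6", data))
print(grep("[0-9]", data, invert=True))
```

<p>The <code>invert</code> parameter stands in for PREG_GREP_INVERT; it is an assumption of this sketch, not part of any library API.</p>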
</div>]]></content:encoded>
</item>
<item>
<title><![CDATA[Regular expressions - preg_match()]]></title>
<link>http://web4us.wordpress.com/2009/10/21/regular-expressions/</link>
<pubDate>Wed, 21 Oct 2009 11:06:58 +0000</pubDate>
<dc:creator>webforus</dc:creator>
<guid>http://web4us.wordpress.com/2009/10/21/regular-expressions/</guid>
<description><![CDATA[PHP regular expressions seems to be a quite complicated area especially if you are not an experience]]></description>
<content:encoded><![CDATA[<div class='snap_preview'><p>PHP regular expressions can seem quite complicated, especially if you are not an experienced Unix user. Regular expressions were originally designed to help with string handling on Unix systems.</p>
<p>With regular expressions you can easily find a pattern in a string and replace it if you want. This is a very powerful tool, but be careful: it is slower than the standard string manipulation functions.<br />
<!--more--><br />
<strong>Regular expression types</strong></p>
<p>There are 2 types of  regular expressions:</p>
<ul>
<li>POSIX Extended</li>
<li>Perl Compatible</li>
</ul>
<p>The ereg, eregi, &#8230; functions are the POSIX versions, and preg_match, preg_replace, &#8230; are the Perl-compatible versions. Note that a Perl-compatible expression must be enclosed in delimiters, for example forward slashes (/). The Perl-compatible version is also more powerful and faster than the POSIX one.</p>
<p><strong>The regular expressions basic syntax</strong></p>
<p>To use regular expressions first you need to learn the syntax of the patterns. We can group the characters inside a pattern like this:</p>
<ul>
<li>Normal characters which match themselves like hello</li>
<li>Start and end indicators as ^ and $</li>
<li>Count indicators like +,*,?</li>
<li>Logical operator like &#124;</li>
<li>Grouping with (), character classes with [], and repetition counts with {}</li>
</ul>
<p>An example pattern to check valid emails looks like this:</p>
<div>
<div>Code:</div>
<pre>^[a-zA-Z0-9._-]+@[a-zA-Z0-9-]+\.[a-zA-Z.]{2,5}$</pre>
</div>
<p>The code to check the email using Perl compatible regular expression looks like this:</p>
<pre>
&#60;?php
$pattern = "/^[a-zA-Z0-9._-]+@[a-zA-Z0-9-]+\.[a-zA-Z.]{2,5}$/";
$str = "test@test.com";
if(preg_match($pattern,$str))
	echo "match";
else
	echo "not match";
?&#62;
</pre>
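<p>As a cross-check, the same pattern can be tried in Python. This is only a sketch with a hypothetical helper name; it also shows that the pattern is a rough filter, since the <code>{2,5}</code> limit rejects some real addresses.</p>

```python
import re

# The email pattern from above, unchanged.
EMAIL_RE = re.compile(r"^[a-zA-Z0-9._-]+@[a-zA-Z0-9-]+\.[a-zA-Z.]{2,5}$")

def looks_like_email(text):
    # True when the whole string fits the pattern.
    return EMAIL_RE.match(text) is not None

print(looks_like_email("test@test.com"))
print(looks_like_email("a@b.museum"))  # rejected: the last part exceeds 5 characters
```

<p>For real validation it is usually safer to treat such a pattern as a first sanity check only.</p>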
<p>The following table summarizes the basic pattern syntax:</p>
<table cellspacing="1" cellpadding="5" border="0" bgcolor="#f0f0f0" width="900">
<tbody>
<tr>
<th>
<p>Regular expression (pattern)</p>
</th>
<th>
<p>Match (subject)</p>
</th>
<th>
<p>Not match (subject)</p>
</th>
<th>Comment</th>
</tr>
<tr>
<td>world</td>
<td>Hello world</td>
<td>Hello Jim</td>
<td>Match if the pattern is present anywhere in the subject</td>
</tr>
<tr>
<td>^world</td>
<td>world class</td>
<td>Hello world</td>
<td>Match if the pattern is present at the beginning of the subject</td>
</tr>
<tr>
<td>world$</td>
<td>Hello world</td>
<td>world class</td>
<td>Match if the pattern is present at the end of the subject</td>
</tr>
<tr>
<td>world/i</td>
<td>This WoRLd</td>
<td>Hello Jim</td>
<td>Makes a search in case insensitive mode</td>
</tr>
<tr>
<td>^world$</td>
<td>world</td>
<td>Hello world</td>
<td>The string is exactly "world"</td>
</tr>
<tr>
<td>world*</td>
<td>worl, world, worlddd</td>
<td>wor</td>
<td>There is 0 or more "d" after "worl"</td>
</tr>
<tr>
<td>world+</td>
<td>world, worlddd</td>
<td>worl</td>
<td>There is at least 1 "d" after "worl"</td>
</tr>
<tr>
<td>world?</td>
<td>worl, world, worly</td>
<td>wor, wory</td>
<td>There is 0 or 1 "d" after "worl"</td>
</tr>
<tr>
<td>world{1}</td>
<td>world</td>
<td>worly</td>
<td>There is 1 "d" after "worl"</td>
</tr>
<tr>
<td>world{1,}</td>
<td>world, worlddd</td>
<td>worly</td>
<td>There is 1 or more "d" after "worl"</td>
</tr>
<tr>
<td>world{2,3}</td>
<td>worldd, worlddd</td>
<td>world</td>
<td>There are 2 or 3 "d" after "worl"</td>
</tr>
<tr>
<td>wo(rld)*</td>
<td>wo, world, worldold</td>
<td>wa</td>
<td>There is 0 or more "rld" after "wo"</td>
</tr>
<tr>
<td>earth&#124;world</td>
<td>earth, world</td>
<td>sun</td>
<td>The string contains the "earth" or the "world"</td>
</tr>
<tr>
<td>w.rld</td>
<td>world, wwrld</td>
<td>wrld</td>
<td>Any character in place of the dot.</td>
</tr>
<tr>
<td>^.{5}$</td>
<td>world, earth</td>
<td>sun</td>
<td>A string with exactly 5 characters</td>
</tr>
<tr>
<td>[abc]</td>
<td>abc, bbaccc</td>
<td>sun</td>
<td>There is an "a" or "b" or "c" in the string</td>
</tr>
<tr>
<td>[a-z]</td>
<td>world</td>
<td>WORLD</td>
<td>There is at least one lowercase letter in the string</td>
</tr>
<tr>
<td>[a-zA-Z]</td>
<td>world, WORLD, Worl12</td>
<td>123</td>
<td>There is at least one lowercase or uppercase letter in the string</td>
</tr>
<tr>
<td>[^wW]</td>
<td>earth</td>
<td>w, W</td>
<td>The actual character can not be a "w" or "W"</td>
</tr>
</tbody>
</table>
<p>The <em>Preg_Match</em> PHP function is used to search a string, and return a 1 or 0. If the search was successful a 1 will be returned, and if it was not found a 0 will be returned. Although other variables can be added, it is most simply phrased as: <strong>preg_match(search_pattern, your_string)</strong>.  The search_pattern needs to be a regular expression.  If you are unfamiliar with them this article gives you an overview of the syntax.</p>
<pre>
&#60;?php
$data = "I had a box of cereal for breakfast today, and then I drank some juice.";
if (preg_match("/juice/", $data))
{
echo "You had juice.";
}
else
{
echo "You did not have juice.";
}
if (preg_match("/eggs/", $data))
{
echo "You had eggs.";
}
else
{
echo "You did not have eggs.";
}
?&#62;
</pre>
<p>The code above uses preg_match to check for a keyword (first juice, then eggs) and replies based on whether the result is true (1) or false (0).</p>
<p>Another example:<br />
regular expressions to validate a time</p>
<pre>
// Must have one and only one space...
"/(0[1-9]&#124;1[0-2]):[0-5][0-9] ([ap]m&#124;[AP]M)/"
// Must have at least one space...
"/(0[1-9]&#124;1[0-2]):[0-5][0-9]\s+([ap]m&#124;[AP]M)/"
</pre>
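<p>A short Python sketch of the second pattern follows. The <code>^</code> and <code>$</code> anchors are an addition made for this example (the patterns above are unanchored and would also match a time embedded in longer text); <code>is_valid_time</code> is a hypothetical helper.</p>

```python
import re

# Second pattern from above, with ^ and $ added (an assumption of this sketch)
# so that the whole string must be a 12-hour time.
TIME_RE = re.compile(r"^(0[1-9]|1[0-2]):[0-5][0-9]\s+([ap]m|[AP]M)$")

def is_valid_time(text):
    return TIME_RE.match(text) is not None

print(is_valid_time("10:30 pm"))
print(is_valid_time("13:00 pm"))  # hour out of range
```

<p>Mixed-case suffixes such as <code>Pm</code> are rejected because the pattern lists only the all-lowercase and all-uppercase forms.</p>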
</div>]]></content:encoded>
</item>
<item>
<title><![CDATA[Wildcard strings for Academic Copyeditors/Regular Expressions]]></title>
<link>http://babelediting.wordpress.com/2009/10/20/wildcard-strings-for-academic-copyeditorsregular-expressions/</link>
<pubDate>Tue, 20 Oct 2009 13:05:24 +0000</pubDate>
<dc:creator>babelediting</dc:creator>
<guid>http://babelediting.wordpress.com/2009/10/20/wildcard-strings-for-academic-copyeditorsregular-expressions/</guid>
<description><![CDATA[Here&#8217;s a list of basic regular expressions/MS Word wildcard search strings, useful for academi]]></description>
<content:encoded><![CDATA[<div class='snap_preview'><p>Here&#8217;s a list of basic regular expressions/MS Word wildcard search strings that are useful in academic copyediting (especially for ripping through a reference list).  The logic of some of them is still a bit shaky, so use them with sensible precaution.</p>
<p><strong>Replaces a bracketed year with a year preceded and followed by periods [(2000) to . 2000.]</strong></p>
<p>\(([0-9]{4})\)<br />
. \1.</p>
<p><strong>The same, but with an index letter for the year ((2000a) to . 2000a.)</strong></p>
<p>\(([0-9]{4}[abc])\)<br />
. \1.</p>
<p><strong>Replaces hyphen between two numbers with en-dash.</strong></p>
<p>([0-9])-([0-9])<br />
\1^=\2</p>
<p><strong>Closing quotation mark, comma, or fullstop ordering from Oxford to Chicago style &#8211; note: does not find Smart Quotes.</strong></p>
<p>(&#34;)([,.])<br />
\2\1</p>
<p><strong>Same as previous, finding a smart quote</strong><br />
(”)([,.])<br />
\2\1</p>
<p><strong>Closing quotation mark, comma or fullstop from Chicago to Oxford style (not Smart Quotes)</strong></p>
<p>([,.])(&#34;)<br />
\2\1</p>
<p><strong>Replace smart single quotes with smart double quotes (be careful with this one in case there are apostrophes around).</strong></p>
<p>‘(*)’<br />
“\1”</p>
<p><strong>Replace smart double quotes with smart single quotes</strong></p>
<p>“(*)”<br />
‘\1’</p>
<p><strong>Change  &#8220;Vol. #, No. #,&#8221; to &#8220;# (#):&#8221;</strong></p>
<p>, Vol. ([0-9]{1,}), No. ([0-9]{1,}),<br />
\1 (\2):</p>
<p><strong>Author-date method; in text, change comma-separated author list to semicolon separated.</strong><br />
([0-9]{4}), ([! ]{1,} [0-9]{4})<br />
\1; \2</p>
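<p>The strings above use MS Word wildcard syntax, which differs from ordinary regular expressions (for example <code>^=</code> for an en dash and <code>[! ]</code> for "not a space"). As a hedged sketch, two of them translate to Python's <code>re.sub</code> roughly as follows; the function names are hypothetical.</p>

```python
import re

def unbracket_years(text):
    # "(2000)" or "(2000a)" becomes ". 2000." or ". 2000a."
    return re.sub(r"\(([0-9]{4}[abc]?)\)", r". \1.", text)

def en_dash_ranges(text):
    # A hyphen between two digits becomes an en dash (U+2013).
    return re.sub(r"([0-9])-([0-9])", "\\1\u2013\\2", text)

print(unbracket_years("Smith (2000a) Title"))
print(en_dash_ranges("pp. 10-20"))
```

<p>As with the Word versions, the space before the bracket is left in place, so the output may still need a manual tidy-up.</p>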
</div>]]></content:encoded>
</item>

</channel>
</rss>
