tamigo-cli/venv/lib/python3.12/site-packages/wcwidth/__pycache__/grapheme.cpython-312.pyc

150 lines
14 KiB

<EFBFBD>
<00><16>iF4<00><01><><00>dZddlmZddlmZddlmZddlmZm Z ddl
m
Z ddl m Z mZmZmZmZmZmZmZmZmZmZmZmZmZerdd lmZd
ZGd <0B>d e<04>Zed <0A><0E>dd<0F><04>Zed <0A><0E>dd<10><04>Z ed <0A><0E>dd<11><04>Z!ed <0A><0E>dd<12><04>Z"ed <0A><0E>dd<13><04>Z#Gd<14>de <09>Z$ed <0A><0E>dd<16><04>Z% d d<17>Z& d! d"d<19>Z'd#d<1A>Z(d$d<1B>Z) d! d"d<1C>Z*y)%z<>
Grapheme cluster segmentation following Unicode Standard Annex #29.
This module provides pure-Python implementation of the grapheme cluster boundary algorithm as
defined in UAX #29: Unicode Text Segmentation.
https://www.unicode.org/reports/tr29/
<EFBFBD>)<01> annotations)<01>IntEnum)<01> lru_cache)<02> TYPE_CHECKING<4E>
NamedTuple<EFBFBD>)<01>bisearch)<0E>
GRAPHEME_L<EFBFBD>
GRAPHEME_T<EFBFBD>
GRAPHEME_V<EFBFBD> GRAPHEME_LV<4C> INCB_EXTEND<4E> INCB_LINKER<45> GRAPHEME_LVT<56>INCB_CONSONANT<4E>GRAPHEME_EXTEND<4E>GRAPHEME_CONTROL<4F>GRAPHEME_PREPEND<4E>GRAPHEME_SPACINGMARK<52>EXTENDED_PICTOGRAPHIC<49>GRAPHEME_REGIONAL_INDICATOR)<01>Iterator<6F> c<01>H<00>eZdZdZdZdZdZdZdZdZ dZ
d Z d
Z d Z d Zd ZdZdZy)<11>GCBz'Grapheme Cluster Break property values.rr<00><00><00><00><00><00><00><00> <00>
<00> <00> <00> N)<12>__name__<5F>
__module__<EFBFBD> __qualname__<5F>__doc__<5F>OTHER<45>CR<43>LF<4C>CONTROL<4F>EXTEND<4E>ZWJ<57>REGIONAL_INDICATOR<4F>PREPEND<4E> SPACING_MARK<52>L<>V<>T<>LV<4C>LVT<56><00><00>V/home/daniel/Projects/tamigo-cli/venv/lib/python3.12/site-packages/wcwidth/grapheme.pyrr,sL<00><00>1<> <0A>E<EFBFBD>
<EFBFBD>B<EFBFBD>
<EFBFBD>B<EFBFBD><0F>G<EFBFBD> <0E>F<EFBFBD>
<0B>C<EFBFBD><1A><16><0F>G<EFBFBD><14>L<EFBFBD> <09>A<EFBFBD>
<EFBFBD>A<EFBFBD>
<EFBFBD>A<EFBFBD> <0B>B<EFBFBD>
<0C>Cr;ri)<01>maxsizec<01> <00>|dk(rtjS|dk(rtjS|dk(rtjSt |t
<00>rtj St |t<00>rtjSt |t<00>rtjSt |t<00>rtjSt |t<00>rtjSt |t<00>rtj St |t"<00>rtj$St |t&<00>rtj(St |t*<00>rtj,St |t.<00>rtj0Stj2S)z;Return the Grapheme_Cluster_Break property for a codepoint.r'r$i )rr-r.r1<00> _bisearchrr/rr0rr2rr3rr4r
r5r r6r r7r r8rr9r,<00><01>ucss r<<00>_grapheme_cluster_breakrBBs <00><00>
 <0B>f<EFBFBD>}<7D><12>v<EFBFBD>v<EFBFBD> <0A>
<EFBFBD>f<EFBFBD>}<7D><12>v<EFBFBD>v<EFBFBD> <0A>
<EFBFBD>f<EFBFBD>}<7D><12>w<EFBFBD>w<EFBFBD><0E><10><13>&<26>'<27><12>{<7B>{<7B><1A><10><13>o<EFBFBD>&<26><12>z<EFBFBD>z<EFBFBD><19><10><13>1<>2<><12>%<25>%<25>%<25><10><13>&<26>'<27><12>{<7B>{<7B><1A><10><13>*<2A>+<2B><12><1F><1F><1F><10><13>j<EFBFBD>!<21><12>u<EFBFBD>u<EFBFBD> <0C><10><13>j<EFBFBD>!<21><12>u<EFBFBD>u<EFBFBD> <0C><10><13>j<EFBFBD>!<21><12>u<EFBFBD>u<EFBFBD> <0C><10><13>k<EFBFBD>"<22><12>v<EFBFBD>v<EFBFBD> <0A><10><13>l<EFBFBD>#<23><12>w<EFBFBD>w<EFBFBD><0E> <0E>9<EFBFBD>9<EFBFBD>r;c<01>4<00>tt|t<00><00>S)z6Check if codepoint has Extended_Pictographic property.)<03>boolr?rr@s r<<00>_is_extended_pictographicrEes<00><00> <10> <09>#<23>4<>5<> 6<>6r;c<01>4<00>tt|t<00><00>S)z,Check if codepoint has InCB=Linker property.)rDr?rr@s r<<00>_is_incb_linkerrGk<00><00><00> <10> <09>#<23>{<7B>+<2B> ,<2C>,r;c<01>4<00>tt|t<00><00>S)z/Check if codepoint has InCB=Consonant property.)rDr?rr@s r<<00>_is_incb_consonantrJqs<00><00> <10> <09>#<23>~<7E>.<2E> /<2F>/r;c<01>4<00>tt|t<00><00>S)z,Check if codepoint has InCB=Extend property.)rDr?rr@s r<<00>_is_incb_extendrLwrHr;c<01>&<00>eZdZUdZded<ded<y)<07> BreakResultz*Result of grapheme cluster break decision.rD<00> should_break<61>int<6E>ri_countN)r(r)r*r+<00>__annotations__r:r;r<rNrN}s<00><00>4<><16><16><11>Mr;rNc<01><00>|tjk(r |tjk(r tdd<02><03>S|tjtjtjfvr tdd<02><03>S|tjtjtjfvr tdd<02><03>S|tj
k(rM|tj
tj tjtjfvr tdd<02><03>S|tjtj fvr/|tj tjfvr tdd<02><03>S|tjtjfvr |tjk(r tdd<02><03>S|tjk(r tdd<02><03>S|tjk(r tdd<02><03>S|tjk(r tdd<02><03>Sy)z<>
Check simple GCB-pair-based break rules (cacheable).
Returns BreakResult for rules that can be determined from GCB properties alone, or None if
complex lookback rules (GB9c, GB11) need to be checked.
Fr<00>rOrQTN) rr-r.rNr/r5r6r8r9r7r0r4r3)<02>prev_gcb<63>curr_gcbs r<<00>_simple_break_checkrW<00>st<00><00><10>3<EFBFBD>6<EFBFBD>6<EFBFBD><19>h<EFBFBD>#<23>&<26>&<26>0<><1A><05><01>:<3A>:<3A><10>C<EFBFBD>K<EFBFBD>K<EFBFBD><13><16><16><13><16><16>0<>0<><1A><04>q<EFBFBD>9<>9<><10>C<EFBFBD>K<EFBFBD>K<EFBFBD><13><16><16><13><16><16>0<>0<><1A><04>q<EFBFBD>9<>9<><10>3<EFBFBD>5<EFBFBD>5<EFBFBD><18>X<EFBFBD>#<23>%<25>%<25><13><15><15><03><06><06><03><07><07>)H<>H<><1A><05><01>:<3A>:<3A><10>C<EFBFBD>F<EFBFBD>F<EFBFBD>C<EFBFBD>E<EFBFBD>E<EFBFBD>?<3F>"<22>x<EFBFBD>C<EFBFBD>E<EFBFBD>E<EFBFBD>3<EFBFBD>5<EFBFBD>5<EFBFBD>><3E>'A<><1A><05><01>:<3A>:<3A><10>C<EFBFBD>G<EFBFBD>G<EFBFBD>S<EFBFBD>U<EFBFBD>U<EFBFBD>#<23>#<23><08>C<EFBFBD>E<EFBFBD>E<EFBFBD>(9<><1A><05><01>:<3A>:<3A><10>3<EFBFBD>:<3A>:<3A><1D><1A><05><01>:<3A>:<3A><10>3<EFBFBD>#<23>#<23>#<23><1A><05><01>:<3A>:<3A><10>3<EFBFBD>;<3B>;<3B><1E><1A><05><01>:<3A>:<3A> r;c<01>$<00>t||<01>}|<05>|S|tjk(r tdd<02><03>St ||<00>}t |<06>r`d}|dz
}|dk\rTt ||<00>} t | <09>rd}|dz}n-t| <09>r|dz}nt | <09>r|r tdd<02><03>Snn|dk\r<01>T|tjk(rft|<06>r[|dz
}|dk\rQt ||<00>} t| <09>}
|
tjk(r|dz}nt| <09>r tdd<02><03>Sn|dk\r<01>Q|tjk(r8|tjk(r%|dzdk(rtd|dz<00><03>Stdd<04><03>S|tjk(rdnd}td|<04><03>S)z<>
Determine if there should be a grapheme cluster break between prev and curr.
Implements UAX #29 grapheme cluster boundary rules.
FrrTrTr) rWrr1rN<00>ordrJrGrLrErBr0r2) rUrV<00>text<78>curr_idxrQ<00>result<6C>curr_ucs<63>
has_linker<EFBFBD>i<>prev_ucs<63> prev_props r<<00> _should_breakrb<00>s<><00><00>!<21><18>8<EFBFBD> 4<>F<EFBFBD> <0A><19><15> <0A><10>3<EFBFBD>7<EFBFBD>7<EFBFBD><1A><1A><05><01>:<3A>:<3A>
<13>4<EFBFBD><08>><3E>"<22>H<EFBFBD><19>(<28>#<23><1A>
<EFBFBD> <14>q<EFBFBD>L<EFBFBD><01><0F>1<EFBFBD>f<EFBFBD><1A>4<EFBFBD><01>7<EFBFBD>|<7C>H<EFBFBD><1E>x<EFBFBD>(<28>!<21>
<EFBFBD><11>Q<EFBFBD><06><01> <20><18>*<2A><11>Q<EFBFBD><06><01>#<23>H<EFBFBD>-<2D><1D>&<26>E<EFBFBD>A<EFBFBD>F<>F<><15><15><10>1<EFBFBD>f<EFBFBD><10>3<EFBFBD>7<EFBFBD>7<EFBFBD><1A>8<><18>B<> <14>q<EFBFBD>L<EFBFBD><01><0F>1<EFBFBD>f<EFBFBD><1A>4<EFBFBD><01>7<EFBFBD>|<7C>H<EFBFBD>/<2F><08>9<>I<EFBFBD><18>C<EFBFBD>J<EFBFBD>J<EFBFBD>&<26><11>Q<EFBFBD><06><01>*<2A>8<EFBFBD>4<>"<22><05><01>B<>B<><15><10>1<EFBFBD>f<EFBFBD><10>3<EFBFBD>)<29>)<29>)<29>h<EFBFBD>#<23>:P<>:P<>.P<> <13>a<EFBFBD><<3C>1<EFBFBD> <1C><1E>E<EFBFBD>H<EFBFBD>q<EFBFBD>L<EFBFBD>I<> I<><1A><04>q<EFBFBD>9<>9<><1D><03> 6<> 6<>6<>q<EFBFBD>A<EFBFBD>H<EFBFBD> <16>D<EFBFBD>8<EFBFBD> <<3C><r;Nc#<01><>K<00>|syt|<00>}|<02>|}||k\s||k\ryt||<03>}|}d}tt||<00><00>}|tj
k(rd}t |dz|<02>D]K}tt||<00><00>}t|||||<05>} | j}| jr |||<00><01>|}|}<06>M|||<00><01>y<01>w)aP
Iterate over grapheme clusters in a Unicode string.
Grapheme clusters are "user-perceived characters" - what a user would
consider a single character, which may consist of multiple Unicode
codepoints (e.g., a base character with combining marks, emoji sequences).
:param unistr: The Unicode string to segment.
:param start: Starting index (default 0).
:param end: Ending index (default len(unistr)).
:yields: Grapheme cluster substrings.
Example::
>>> list(iter_graphemes('cafe\u0301'))
['c', 'a', 'f', 'e\u0301']
>>> list(iter_graphemes('\U0001F468\u200D\U0001F469\u200D\U0001F467'))
['\U0001F468\u200D\U0001F469\u200D\U0001F467']
>>> list(iter_graphemes('\U0001F1FA\U0001F1F8'))
['\U0001F1FA\U0001F1F8']
.. versionadded:: 0.3.0
Nrr)
<EFBFBD>len<65>minrBrYrr2<00>rangerbrQrO)
<EFBFBD>unistr<74>start<72>end<6E>length<74> cluster_startrQrU<00>idxrVr\s
r<<00>iter_graphemesrm<00>s<><00><00><><00>8 <12><0E> <10><16>[<5B>F<EFBFBD>
<EFBFBD>{<7B><14><03> <0C><03>|<7C>u<EFBFBD><06><EFBFBD><0E>
<0A>c<EFBFBD>6<EFBFBD>
<1A>C<EFBFBD><1A>M<EFBFBD><10>H<EFBFBD>'<27>s<EFBFBD>6<EFBFBD>%<25>=<3D>'9<>:<3A>H<EFBFBD><10>3<EFBFBD>)<29>)<29>)<29><14><08><14>U<EFBFBD>Q<EFBFBD>Y<EFBFBD><03>$<24>
<1C><03>*<2A>3<EFBFBD>v<EFBFBD>c<EFBFBD>{<7B>+;<3B><<3C><08><1E>x<EFBFBD><18>6<EFBFBD>3<EFBFBD><08>I<><06><19>?<3F>?<3F><08> <11> <1E> <1E><18><1D>s<EFBFBD>+<2B> +<2B><1F>M<EFBFBD><1B><08>
<1C> <11><1D>s<EFBFBD>
#<23>#<23>s<00>B?Cc<01><><00>t||dz
<00>}|dk(r|dk\r||dz
dk(r|dz
S|dkrP|dk\rF|dk\rAt||dz
<00>}|dk\r+t|<03>tjk(rt ||dz
<00>S|dz
S|dz
}|dkDr]||z
t
krQt||<00>}d|cxkrdkrnnn4t|<05>tj k(rn|dz}|dkDr ||z
t
kr<01>Q|}tt||<00><00>}|tjk(rdnd}t|dz|<01>D]D} tt|| <00><00>}
t||
|| |<08>} | j}| jr| }|
}<07>F|S)a
Find the start of the grapheme cluster containing the character before pos.
Scans backwards from pos to find a safe starting point, then iterates forward using standard
break rules to find the actual cluster boundary.
:param text: The Unicode string.
:param pos: Position to search before (exclusive).
:returns: Start position of the grapheme cluster.
rr$r<00> <0A><>rr) rYrBrr3<00>_find_cluster_start<72>MAX_GRAPHEME_SCANr/r2rfrbrQrO) rZ<00>pos<6F> target_cp<63>prev_cp<63>
safe_start<EFBFBD>cprk<00>left_gcbrQr_<00> right_gcbr\s r<rqrq<s<><00><00><14>D<EFBFBD><13>q<EFBFBD><17>M<EFBFBD>"<22>I<EFBFBD><11>D<EFBFBD><18>S<EFBFBD>A<EFBFBD>X<EFBFBD>$<24>s<EFBFBD>Q<EFBFBD>w<EFBFBD>-<2D>4<EFBFBD>*?<3F><12>Q<EFBFBD>w<EFBFBD><0E><11>4<EFBFBD><17> <0E>!<21>8<EFBFBD> <09>T<EFBFBD>)<29><19>$<24>s<EFBFBD>Q<EFBFBD>w<EFBFBD>-<2D>(<28>G<EFBFBD><16>$<24><EFBFBD>#:<3A>7<EFBFBD>#C<>s<EFBFBD>{<7B>{<7B>#R<>*<2A>4<EFBFBD><13>q<EFBFBD><17>9<>9<><12>Q<EFBFBD>w<EFBFBD><0E><15>q<EFBFBD><17>J<EFBFBD>
<14>q<EFBFBD>.<2E>c<EFBFBD>J<EFBFBD>.<2E>2C<32>C<> <10><14>j<EFBFBD>!<21> "<22><02> <0F>2<EFBFBD> <1C><04> <1C> <11> "<22>2<EFBFBD> &<26>#<23>+<2B>+<2B> 5<> <11><12>a<EFBFBD><0F>
<EFBFBD> <15>q<EFBFBD>.<2E>c<EFBFBD>J<EFBFBD>.<2E>2C<32>C<><1F>M<EFBFBD>&<26>s<EFBFBD>4<EFBFBD>
<EFBFBD>+;<3B>'<<3C>=<3D>H<EFBFBD><1C><03> 6<> 6<>6<>q<EFBFBD>A<EFBFBD>H<EFBFBD> <12>:<3A><01>><3E>3<EFBFBD> '<27><1D><01>+<2B>C<EFBFBD><04>Q<EFBFBD><07>L<EFBFBD>9<> <09><1E>x<EFBFBD><19>D<EFBFBD>!<21>X<EFBFBD>F<><06><19>?<3F>?<3F><08> <11> <1E> <1E><1D>M<EFBFBD><1C><08> <1D> <19>r;c <01>L<00>|dkryt|t|t|<00><00><00>S)a<>
Find the grapheme cluster boundary immediately before a position.
:param unistr: The Unicode string to search.
:param pos: Position in the string (0 < pos <= len(unistr)).
:returns: Start index of the grapheme cluster containing the character at pos-1.
Example::
>>> grapheme_boundary_before('Hello \U0001F44B\U0001F3FB', 8)
6
>>> grapheme_boundary_before('a\r\nb', 3)
1
.. versionadded:: 0.3.6
r)rqrerd)rgrss r<<00>grapheme_boundary_beforer{ps&<00><00>" <0B>a<EFBFBD>x<EFBFBD><10> <1E>v<EFBFBD>s<EFBFBD>3<EFBFBD><03>F<EFBFBD> <0B>'<<3C> =<3D>=r;c#<01><>K<00>|syt|<00>}|<02>|n t||<03>}t|d<02>}||k\s||k\ry|}||kDr"t||<04>}||kry|||<00><01>|}||kDr<01>!yy<01>w)a<>
Iterate over grapheme clusters in reverse order (last to first).
:param unistr: The Unicode string to segment.
:param start: Starting index (default 0).
:param end: Ending index (default len(unistr)).
:yields: Grapheme cluster substrings in reverse order.
Example::
>>> list(iter_graphemes_reverse('cafe\u0301'))
['e\u0301', 'f', 'a', 'c']
.. versionadded:: 0.3.6
Nr)rdre<00>maxrq)rgrhrirjrsrks r<<00>iter_graphemes_reverser~<00>s<><00><00><><00>( <12><0E> <10><16>[<5B>F<EFBFBD><17>K<EFBFBD>&<26>S<EFBFBD><13>f<EFBFBD>%5<>C<EFBFBD> <0F><05>q<EFBFBD>M<EFBFBD>E<EFBFBD> <0C><03>|<7C>u<EFBFBD><06><EFBFBD><0E>
<0A>C<EFBFBD>
<0A><05>+<2B>+<2B>F<EFBFBD>C<EFBFBD>8<> <0A> <18>5<EFBFBD> <20> <11><14>]<5D>3<EFBFBD>'<27>'<27><1B><03> <0E><05>+<2B>s <00>AA"<01> A")rArP<00>returnr)rArPrrD)rUrrVrrzBreakResult | None) rUrrVrrZ<00>strr[rPrQrPrrN)rN)rgr<>rhrPriz
int | Nonerz Iterator[str])rZr<>rsrPrrP)rgr<>rsrPrrP)+r+<00>
__future__r<00>enumr<00> functoolsr<00>typingrrr r?<00>table_graphemer
r r r rrrrrrrrrr<00>collections.abcrrrrrBrErGrJrLrNrWrbrmrqr{r~r:r;r<<00><module>r<>s<><00><01><04>#<23><19><1F>,<2C>,<2C> :<3A> :<3A> :<3A> :<3A><11>(<28><17><11> <0A>'<27> <0A>, <0B>4<EFBFBD><18><15><19><15>D <0B>4<EFBFBD><18>7<><19>7<>
 <0B>4<EFBFBD><18>-<2D><19>-<2D>
 <0B>4<EFBFBD><18>0<><19>0<>
 <0B>4<EFBFBD><18>-<2D><19>-<2D>
<12>*<2A><12> <0B>4<EFBFBD><18>-<10><19>-<10>`@=<3D><11>@=<3D><11>@=<3D> <0E>@=<3D><12> @=<3D>
<12> @=<3D> <11> @=<3D>J<13><1A>A$<24> <0F>A$<24> <0E>A$<24>
<14>A$<24><13> A$<24>H1<19>h><3E>0<13><1A>&<1C> <0F>&<1C> <0E>&<1C>
<14>&<1C><13> &r;