From ec38b12bd5f5610430823a110c0277e00d004a64 Mon Sep 17 00:00:00 2001
From: WenmuZhou
Date: Mon, 2 Aug 2021 19:42:10 +0800
Subject: [PATCH] add quick start

---
 doc/table/table.jpg            | Bin 0 -> 24674 bytes
 ppstructure/README.md          | 18 +++++++++++++----
 ppstructure/README_ch.md       | 20 ++++++++++++++-----
 ppstructure/table/README.md    | 24 ++++++++++++++++++++---
 ppstructure/table/README_ch.md | 34 +++++++++++++++++++++++++--------
 5 files changed, 76 insertions(+), 20 deletions(-)
 create mode 100644 doc/table/table.jpg

diff --git a/doc/table/table.jpg b/doc/table/table.jpg
new file mode 100644
index 0000000000000000000000000000000000000000..3daa619e52dc2471df62ea7767be3bff350b623f
GIT binary patch
literal 24674
[base85-encoded JPEG data for doc/table/table.jpg omitted]

literal 0
HcmV?d00001

diff --git a/ppstructure/README.md b/ppstructure/README.md
index 2833bd07..0303fcb4 100644
--- a/ppstructure/README.md
+++ b/ppstructure/README.md
@@ -103,15 +103,25 @@ Table OCR converts table image into excel documents, which include the detection
 Use the following commands to complete the inference.
 
 ```python
-python3 table/predict_system.py --det_model_dir=path/to/det_model_dir --rec_model_dir=path/to/rec_model_dir --table_model_dir=path/to/table_model_dir --image_dir=../doc/table/1.png --rec_char_dict_path=../ppocr/utils/dict/table_dict.txt --table_char_dict_path=../ppocr/utils/dict/table_structure_dict.txt --rec_char_type=EN --det_limit_side_len=736 --det_limit_type=min --output=../output/table --vis_font_path=../doc/fonts/simfang.ttf
+cd PaddleOCR/ppstructure
+
+# download model
+mkdir inference && cd inference
+# Download the detection model of the ultra-lightweight Chinese OCR model and uncompress it
+wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar && tar xf ch_ppocr_mobile_v2.0_det_infer.tar
+# Download the recognition model of the ultra-lightweight Chinese OCR model and uncompress it
+wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar && tar xf ch_ppocr_mobile_v2.0_rec_infer.tar
+# Download the table structure model of the ultra-lightweight Chinese OCR model and uncompress it
+wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/table/en_ppocr_mobile_v2.0_table_structure_infer.tar && tar xf en_ppocr_mobile_v2.0_table_structure_infer.tar
+cd ..
+
+python3 table/predict_system.py --det_model_dir=inference/ch_ppocr_mobile_v2.0_det_infer --rec_model_dir=inference/ch_ppocr_mobile_v2.0_rec_infer --table_model_dir=inference/en_ppocr_mobile_v2.0_table_structure_infer --image_dir=../doc/table/1.png --rec_char_dict_path=../ppocr/utils/ppocr_keys_v1.txt --table_char_dict_path=../ppocr/utils/dict/table_structure_dict.txt --rec_char_type=ch --det_limit_side_len=736 --det_limit_type=min --output=../output/table --vis_font_path=../doc/fonts/simfang.ttf
 ```
-After running, each image will have a directory with the same name under the directory specified in the output field. Each table in the picture will be stored as an excel, and the excel file name will be the coordinates of the table in the image.
+After running, each image will have a directory with the same name under the directory specified in the output field. Each table in the picture will be stored as an excel and figure area will be cropped and saved, the excel and image file name will be the coordinates of the table in the image.
 
 **Model List**
 
 |model name|description|config|model size|download|
 | --- | --- | --- | --- | --- |
-|en_ppocr_mobile_v2.0_table_det|Text detection in English table scene|[ch_det_mv3_db_v2.0.yml](../configs/det/ch_ppocr_v2.0/ch_det_mv3_db_v2.0.yml)| 4.7M |[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/table/en_ppocr_mobile_v2.0_table_det_infer.tar) |
-|en_ppocr_mobile_v2.0_table_rec|Text recognition in English table scene|[rec_chinese_lite_train_v2.0.yml](..//configs/rec/rec_mv3_none_bilstm_ctc.yml)|6.9M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/table/en_ppocr_mobile_v2.0_table_rec_infer.tar) |
 |en_ppocr_mobile_v2.0_table_structure|Table structure prediction for English table scenarios|[table_mv3.yml](../configs/table/table_mv3.yml)|18.6M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/table/en_ppocr_mobile_v2.0_table_structure_infer.tar) |
\ No newline at end of file
diff --git a/ppstructure/README_ch.md b/ppstructure/README_ch.md
index 4f961cfc..709757d5 100644
--- a/ppstructure/README_ch.md
+++ b/ppstructure/README_ch.md
@@ -97,7 +97,7 @@ dict 里各个字段说明如下
 
 版面分析对文档数据进行区域分类,其中包括版面分析工具的Python脚本使用、提取指定类别检测框、性能指标以及自定义训练版面分析模型,详细内容可以参考[文档](layout/README.md)。
 
-### 2.2 表格识别
+### 2.2 表格结构化
 
 Table OCR将表格图片转换为excel文档,其中包含对于表格文本的检测和识别以及对于表格结构和单元格坐标的预测,详细说明参考[文档](table/README_ch.md)
 
@@ -106,14 +106,24 @@ Table OCR将表格图片转换为excel文档,其中包含对于表格文本的
 使用如下命令即可完成预测引擎的推理
 
 ```python
-python3 table/predict_system.py --det_model_dir=path/to/det_model_dir --rec_model_dir=path/to/rec_model_dir --table_model_dir=path/to/table_model_dir --image_dir=../doc/table/1.png --rec_char_dict_path=../ppocr/utils/dict/table_dict.txt --table_char_dict_path=../ppocr/utils/dict/table_structure_dict.txt --rec_char_type=EN --det_limit_side_len=736 --det_limit_type=min --output=../output/table --vis_font_path=../doc/fonts/simfang.ttf
+cd PaddleOCR/ppstructure
+
+# 下载模型
+mkdir inference && cd inference
+# 下载超轻量级中文OCR模型的检测模型并解压
+wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar && tar xf ch_ppocr_mobile_v2.0_det_infer.tar
+# 下载超轻量级中文OCR模型的识别模型并解压
+wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar && tar xf ch_ppocr_mobile_v2.0_rec_infer.tar
+# 下载超轻量级英文表格英寸模型并解压
+wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/table/en_ppocr_mobile_v2.0_table_structure_infer.tar && tar xf en_ppocr_mobile_v2.0_table_structure_infer.tar
+cd ..
+
+python3 table/predict_system.py --det_model_dir=inference/ch_ppocr_mobile_v2.0_det_infer --rec_model_dir=inference/ch_ppocr_mobile_v2.0_rec_infer --table_model_dir=inference/en_ppocr_mobile_v2.0_table_structure_infer --image_dir=../doc/table/1.png --rec_char_dict_path=../ppocr/utils/ppocr_keys_v1.txt --table_char_dict_path=../ppocr/utils/dict/table_structure_dict.txt --rec_char_type=ch --det_limit_side_len=736 --det_limit_type=min --output=../output/table --vis_font_path=../doc/fonts/simfang.ttf
 ```
-运行完成后,每张图片会output字段指定的目录下有一个同名目录,图片里的每个表格会存储为一个excel,excel文件名为表格在图片里的坐标。
+运行完成后,每张图片会在`output`字段指定的目录下有一个同名目录,图片里的每个表格会存储为一个excel,图片区域会被裁剪之后保存下来,excel文件和图片名名为表格在图片里的坐标。
 
 **Model List**
 
 |模型名称|模型简介|配置文件|推理模型大小|下载地址|
 | --- | --- | --- | --- | --- |
-|en_ppocr_mobile_v2.0_table_det|英文表格场景的文字检测|[ch_det_mv3_db_v2.0.yml](../configs/det/ch_ppocr_v2.0/ch_det_mv3_db_v2.0.yml)| 4.7M |[推理模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/table/en_ppocr_mobile_v2.0_table_det_infer.tar) |
-|en_ppocr_mobile_v2.0_table_rec|英文表格场景的文字识别|[rec_chinese_lite_train_v2.0.yml](../configs/rec/rec_mv3_none_bilstm_ctc.yml)|6.9M|[推理模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/table/en_ppocr_mobile_v2.0_table_rec_infer.tar) |
 |en_ppocr_mobile_v2.0_table_structure|英文表格场景的表格结构预测|[table_mv3.yml](../configs/table/table_mv3.yml)|18.6M|[推理模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/table/en_ppocr_mobile_v2.0_table_structure_infer.tar) |
\ No newline at end of file
diff --git a/ppstructure/table/README.md b/ppstructure/table/README.md
index c538db27..4c3d789a 100644
--- a/ppstructure/table/README.md
+++ b/ppstructure/table/README.md
@@ -17,8 +17,26 @@ The table ocr flow chart is as follows
 
 ## 2. How to use
 
+### 2.1 quick start
 
-### 2.1 Train
+```python
+cd PaddleOCR/ppstructure
+
+# download model
+mkdir inference && cd inference
+# Download the detection model of the ultra-lightweight Chinese OCR model and uncompress it
+wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar && tar xf ch_ppocr_mobile_v2.0_det_infer.tar
+# Download the recognition model of the ultra-lightweight Chinese OCR model and uncompress it
+wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar && tar xf ch_ppocr_mobile_v2.0_rec_infer.tar
+# Download the table structure model of the ultra-lightweight Chinese OCR model and uncompress it
+wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/table/en_ppocr_mobile_v2.0_table_structure_infer.tar && tar xf en_ppocr_mobile_v2.0_table_structure_infer.tar
+cd ..
+
+python3 table/predict_table.py --det_model_dir=inference/ch_ppocr_mobile_v2.0_det_infer --rec_model_dir=inference/ch_ppocr_mobile_v2.0_rec_infer --table_model_dir=inference/en_ppocr_mobile_v2.0_table_structure_infer --image_dir=../doc/table/table.jpg --rec_char_dict_path=../ppocr/utils/ppocr_keys_v1.txt --table_char_dict_path=../ppocr/utils/dict/table_structure_dict.txt --rec_char_type=ch --det_limit_side_len=736 --det_limit_type=min --output ../output/table
+```
+After running, the excel sheet of each picture will be saved in the directory specified by the output field
+
+### 2.2 Train
 
 In this chapter, we only introduce the training of the table structure model, For model training of [text detection](../../doc/doc_en/detection_en.md) and [text recognition](../../doc/doc_en/recognition_en.md), please refer to the corresponding documents
 
@@ -48,7 +66,7 @@ python3 tools/train.py -c configs/table/table_mv3.yml -o Global.checkpoints=./yo
 
 **Note**: The priority of `Global.checkpoints` is higher than that of `Global.pretrain_weights`, that is, when two parameters are specified at the same time, the model specified by `Global.checkpoints` will be loaded first. If the model path specified by `Global.checkpoints` is wrong, the one specified by `Global.pretrain_weights` will be loaded.
 
-### 2.2 Eval
+### 2.3 Eval
 
 The table uses TEDS (Tree-Edit-Distance-based Similarity) as the evaluation metric of the model. Before the model evaluation, the three models in the pipeline need to be exported as inference models (we have provided them), and the gt for evaluation needs to be prepared. Examples of gt are as follows:
 ```json
@@ -70,7 +88,7 @@ python3 table/eval_table.py --det_model_dir=path/to/det_model_dir --rec_model_di
 ```
 
 
-### 2.3 Inference
+### 2.4 Inference
 
 ```python
 cd PaddleOCR/ppstructure
diff --git a/ppstructure/table/README_ch.md b/ppstructure/table/README_ch.md
index 5981dab4..a1bd2442 100644
--- a/ppstructure/table/README_ch.md
+++ b/ppstructure/table/README_ch.md
@@ -1,6 +1,6 @@
-# Table OCR
+# 表格结构化
 
-## 1. Table OCR pineline
+## 1. 表格结构化 pineline
 表格的ocr主要包含三个模型
 1. 单行文本检测-DB
 2. 单行文本识别-CRNN
@@ -19,7 +19,26 @@
 
 ## 2. 使用
 
-### 2.1 训练
+### 2.1 快速开始
+
+```python
+cd PaddleOCR/ppstructure
+
+# 下载模型
+mkdir inference && cd inference
+# 下载超轻量级中文OCR模型的检测模型并解压
+wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar && tar xf ch_ppocr_mobile_v2.0_det_infer.tar
+# 下载超轻量级中文OCR模型的识别模型并解压
+wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar && tar xf ch_ppocr_mobile_v2.0_rec_infer.tar
+# 下载超轻量级英文表格英寸模型并解压
+wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/table/en_ppocr_mobile_v2.0_table_structure_infer.tar && tar xf en_ppocr_mobile_v2.0_table_structure_infer.tar
+cd ..
+# 执行预测
+python3 table/predict_table.py --det_model_dir=inference/ch_ppocr_mobile_v2.0_det_infer --rec_model_dir=inference/ch_ppocr_mobile_v2.0_rec_infer --table_model_dir=inference/en_ppocr_mobile_v2.0_table_structure_infer --image_dir=../doc/table/table.jpg --rec_char_dict_path=../ppocr/utils/ppocr_keys_v1.txt --table_char_dict_path=../ppocr/utils/dict/table_structure_dict.txt --rec_char_type=ch --det_limit_side_len=736 --det_limit_type=min --output ../output/table
+```
+运行完成后,每张图片的excel表格会保存到output字段指定的目录下
+
+### 2.2 训练
 在这一章节中,我们仅介绍表格结构模型的训练,[文字检测](../../doc/doc_ch/detection.md)和[文字识别](../../doc/doc_ch/recognition.md)的模型训练请参考对应的文档。
 
 #### 数据准备
@@ -46,7 +65,7 @@ python3 tools/train.py -c configs/table/table_mv3.yml -o Global.checkpoints=./yo
 
 **注意**:`Global.checkpoints`的优先级高于`Global.pretrain_weights`的优先级,即同时指定两个参数时,优先加载`Global.checkpoints`指定的模型,如果`Global.checkpoints`指定的模型路径有误,会加载`Global.pretrain_weights`指定的模型。
 
-### 2.2 评估
+### 2.3 评估
 
 表格使用 TEDS(Tree-Edit-Distance-based Similarity) 作为模型的评估指标。在进行模型评估之前,需要将pipeline中的三个模型分别导出为inference模型(我们已经提供好),还需要准备评估的gt, gt示例如下:
 ```json
@@ -56,7 +75,7 @@ python3 tools/train.py -c configs/table/table_mv3.yml -o Global.checkpoints=./yo
  [["", "F", "e", "a", "t", "u", "r", "e", ""], ["", "G", "b", "3", " ", "+", ""], ["", "G", "b", "3", " ", "-", ""], ["", "P", "a", "t", "i", "e", "n", "t", "s", ""], ["6", "2"], ["4", "5"]]
 ]}
 ```
-json 中,key为图片名,value为对应的gt,gt是一个由四个item组成的list,每个item分别为
+json 中,key为图片名,value为对应的gt,gt是一个由三个item组成的list,每个item分别为
 1. 表格结构的html字符串list
 2. 每个cell的坐标 (不包括cell里文字为空的)
 3. 每个cell里的文字信息 (不包括cell里文字为空的)
@@ -67,10 +86,9 @@ cd PaddleOCR/ppstructure
 python3 table/eval_table.py --det_model_dir=path/to/det_model_dir --rec_model_dir=path/to/rec_model_dir --table_model_dir=path/to/table_model_dir --image_dir=../doc/table/1.png --rec_char_dict_path=../ppocr/utils/dict/table_dict.txt --table_char_dict_path=../ppocr/utils/dict/table_structure_dict.txt --rec_char_type=EN --det_limit_side_len=736 --det_limit_type=min --gt_path=path/to/gt.json
 ```
 
+### 2.4 预测
 
-### 2.3 预测
 ```python
 cd PaddleOCR/ppstructure
 python3 table/predict_table.py --det_model_dir=path/to/det_model_dir --rec_model_dir=path/to/rec_model_dir --table_model_dir=path/to/table_model_dir --image_dir=../doc/table/1.png --rec_char_dict_path=../ppocr/utils/dict/table_dict.txt --table_char_dict_path=../ppocr/utils/dict/table_structure_dict.txt --rec_char_type=EN --det_limit_side_len=736 --det_limit_type=min --output ../output/table
-```
-运行完成后,每张图片的excel表格会保存到output字段指定的目录下
+```
\ No newline at end of file
-- 
GitLab