# Code Conventions (Style)

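A formal project usually has its own code conventions, which specify how code should be written in that project. An individual developer is free to use any style they like, but in team development you are generally expected to follow the conventions the team has agreed on.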

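Each programming language has some well-known code conventions. For example, the famous `K&R` refers to Kernighan and Ritchie, the two authors of *The C Programming Language*, the world's first book introducing the C language. `K&R style` is the style they used to write code in that book.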

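Google publishes a set of style guides, [Google Style Guides](https://google.github.io/styleguide/), covering most mainstream programming languages. The following `Python triple-quoted string` excerpts the core description from that page: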

```python
google_style_guide = '''
Every major open-source project has its own style guide: a set of conventions (sometimes arbitrary) about how to write code for that project. It is much easier to understand a large codebase when all the code in it is in a consistent style.

“Style” covers a lot of ground, from “use camelCase for variable names” to “never use global variables” to “never use exceptions.” This project (google/styleguide) links to the style guidelines we use for Google code. If you are modifying a project that originated at Google, you may be pointed to this page to see the style guides that apply to that project.

This project holds the C++ Style Guide, C# Style Guide, Swift Style Guide, Objective-C Style Guide, Java Style Guide, Python Style Guide, R Style Guide, Shell Style Guide, HTML/CSS Style Guide, JavaScript Style Guide, TypeScript Style Guide, AngularJS Style Guide, Common Lisp Style Guide, and Vimscript Style Guide. This project also contains cpplint, a tool to assist with style guide compliance, and google-c-style.el, an Emacs settings file for Google style.
'''
```

CodeChina hosts a Chinese mirror repository: [zh-google-styleguide](https://codechina.csdn.net/sj15102696860/zh-google-styleguide). As you can see there, the `Google Python Style Guide` is quite short: [python_style_rules](https://google-styleguide.readthedocs.io/zh_CN/latest/google-python-styleguide/python_style_rules.html)

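In addition, the official Python style document is [PEP8](https://legacy.python.org/dev/peps/pep-0008/). What is a `PEP`, you may ask? `PEP` is short for `Python Enhancement Proposals`. A `PEP` is a technical specification describing an enhancement for the Python community. It is also the proposal through which a new feature is submitted, so that the community can point out problems and the technical documentation can be refined.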

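In day-to-day development, you can configure plugins in your development environment to help check code style automatically. The following `Python triple-quoted string` lists some related resources: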

```python
python_style_guides = '''
* Python 代码风格指南',
    * [google-python-styleguide_zh_cn](https://zh-google-styleguide.readthedocs.io/en/latest/google-python-styleguide/python_style_rules /)
    * [PEP8](https://legacy.python.org/dev/peps/pep-0008/)
* 代码风格和自动完成工具链
    * 基本工具
        * [pylint](https://pylint.org/)
        * [autopep8](https://pypi.org/project/autopep8/)
    * Visual Studio Code Python 开发基本插件
        * Pylance
        * Python Path
        * Python-autopep8
'''
```
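
As a rough illustration of the kind of check that tools such as pylint automate, the sketch below flags lines longer than 79 characters, the maximum line length PEP 8 recommends. The function name `find_long_lines` is hypothetical and is not part of pylint or autopep8; real tools check far more rules.

```python
def find_long_lines(source, max_length=79):
    # Return (line_number, length) pairs for every line longer than max_length.
    return [(number, len(line))
            for number, line in enumerate(source.splitlines(), start=1)
            if len(line) > max_length]


# Hypothetical sample: the first line is short, the second has 85 characters.
sample = 'x = 1\n' + 'y = ' + '1 + ' * 20 + '1\n'
print(find_long_lines(sample))
```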

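Write a piece of Python code that counts the frequency of the English words in the two `Python triple-quoted strings` above. Requirements:

* Ignore case when comparing words
* Use the list `splits = ['\n', ' ', '-', ':', '/', '*', '_', '(', ')', '"', '”', '“',']','[']` to split the text into words
* Output the 5 most frequent words together with their frequencies.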
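
The splitting and case-folding rules above can also be sketched with the standard library. This is an illustrative alternative, not the approach used by the exercise framework. The helper name `count_words` and the sample sentence are hypothetical, and unlike a character-by-character loop, this version drops empty strings.

```python
import re
from collections import Counter


def count_words(text, splits):
    # Build a character class from the separators; re.escape protects
    # characters such as ']' and '-' that are special inside a class.
    pattern = '[' + ''.join(re.escape(s) for s in splits) + ']+'
    # Lower-case every piece so that 'Style' and 'style' are the same word.
    words = [piece.lower() for piece in re.split(pattern, text) if piece]
    return Counter(words)


splits = ['\n', ' ', '-', ':', '/', '*', '_', '(', ')', '"', '”', '“', ']', '[']
counts = count_words('Style guide: the Google-style guide', splits)
print(counts.most_common(2))
```

`Counter.most_common(n)` returns the `n` most frequent items, which covers the "top 5" requirement without a manual sort.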

The basic code framework is as follows:

```python
# -*- coding: UTF-8 -*-
def top_words(splits, text, top_n=5):
    i = 0
    word_dict = {}
    chars = []
    while i < len(text):
        c = text[i]
        if c in splits:
            # Skip any run of consecutive separator characters
            while i+1 < len(text) and text[i+1] in splits:
                i += 1
            word = ''.join(chars).lower()

            # Count word frequency
            # TODO(You): add your code here

            chars = []
        else:
            chars.append(c)

        i += 1

    word_list = list(word_dict.values())
    top_n = min(top_n, len(word_list))
    word_list.sort(key=lambda word_info: word_info['count'], reverse=True)
    return word_list[0:top_n]

if __name__ == '__main__':
    google_style_guide = ...
    python_style_guides = ...
    splits = ['\n', ' ', '-', ':', '/', '*',
              '_', '(', ')', '"', '”', '“', '[', ']']

    tops = top_words(splits, google_style_guide+python_style_guides)

    print('单词排行榜')
    print('--------')
    i = 0
    while i < len(tops):
        top = tops[i]
        word = top['word']
        count = top['count']
        print(f'{i+1}. 单词:{word}, 词频:{count}')
        i += 1
```

The expected output is shown below (in the program's output, `单词排行榜` means "word ranking", `单词` means "word", and `词频` means "frequency"):

```bash
单词排行榜
--------
1. 单词:style, 词频:20
2. 单词:guide,, 词频:13
3. 单词:to, 词频:9
4. 单词:python, 词频:8
5. 单词:google, 词频:7
```

The options below are different implementations of the `TODO` part of the code. Can you find the one that is <span style="color:red">incorrect</span>?

## template

```python
def top_words(splits, text, top_n=5):
    i = 0
    word_dict = {}
    chars = []
    while i < len(text):
        c = text[i]
        if c in splits:
            while i+1 < len(text) and text[i+1] in splits:
                i += 1
            word = ''.join(chars).lower()

            word_info = word_dict.get(word, {'word': word, 'count': 0})
            word_info['count'] += 1
            word_dict[word] = word_info

            chars = []
        else:
            chars.append(c)

        i += 1

    word_list = list(word_dict.values())
    top_n = min(top_n, len(word_list))
    word_list.sort(key=lambda word_info: word_info['count'], reverse=True)
    return word_list[0:top_n]


if __name__ == '__main__':
    google_style_guide = '''
Every major open-source project has its own style guide: a set of conventions (sometimes arbitrary) about how to write code for that project. It is much easier to understand a large codebase when all the code in it is in a consistent style.

“Style” covers a lot of ground, from “use camelCase for variable names” to “never use global variables” to “never use exceptions.” This project (google/styleguide) links to the style guidelines we use for Google code. If you are modifying a project that originated at Google, you may be pointed to this page to see the style guides that apply to that project.

This project holds the C++ Style Guide, C# Style Guide, Swift Style Guide, Objective-C Style Guide, Java Style Guide, Python Style Guide, R Style Guide, Shell Style Guide, HTML/CSS Style Guide, JavaScript Style Guide, TypeScript Style Guide, AngularJS Style Guide, Common Lisp Style Guide, and Vimscript Style Guide. This project also contains cpplint, a tool to assist with style guide compliance, and google-c-style.el, an Emacs settings file for Google style.
'''

    python_style_guides = '''
* Python 代码风格指南',
    * [google-python-styleguide_zh_cn](https://zh-google-styleguide.readthedocs.io/en/latest/google-python-styleguide/python_style_rules /)
    * [PEP8](https://legacy.python.org/dev/peps/pep-0008/)
* 代码风格和自动完成工具链
    * 基本工具
        * [pylint](https://pylint.org/)
        * [autopep8](https://pypi.org/project/autopep8/)
    * Visual Studio Code Python 开发基本插件
        * Pylance
        * Python Path
        * Python-autopep8
'''

    splits = ['\n', ' ', '-', ':', '/', '*',
              '_', '(', ')', '"', '”', '“', '[', ']']

    tops = top_words(splits, google_style_guide+python_style_guides)

    print('单词排行榜')
    print('--------')
    i = 0
    while i < len(tops):
        top = tops[i]
        word = top['word']
        count = top['count']
        print(f'{i+1}. 单词:{word}, 词频:{count}')
        i += 1
```

## Answer

```python
if not word in word_dict:
    word_info = {'word': word, 'count': 0}
    word_dict[word] = word_info

word_info['count'] += 1
```
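
A minimal sketch of why this snippet miscounts: when `word` is already in `word_dict`, the `if` body is skipped, so `word_info` still refers to the entry created for an earlier word and the wrong counter is incremented. The wrapper `count_buggy` and the sample word list are hypothetical, added only for illustration.

```python
def count_buggy(words):
    word_dict = {}
    for word in words:
        if not word in word_dict:
            word_info = {'word': word, 'count': 0}
            word_dict[word] = word_info

        # Bug: for a repeated word, word_info is left over from an
        # earlier iteration, so another word's count goes up instead.
        word_info['count'] += 1
    return {word: info['count'] for word, info in word_dict.items()}


result = count_buggy(['style', 'guide', 'style'])
print(result)  # 'style' appears twice but is reported once
```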

## Options

### Initialize the count to 0 if the word is missing, then always increment

```python
word_info = word_dict.get(word)
if word_info is None:
    word_info = {'word': word, 'count': 0}
    word_dict[word] = word_info

word_info['count'] += 1
```

### Initialize the count to 1 if the word is missing, otherwise increment

```python
word_info = word_dict.get(word)
if word_info is None:
    word_info = {'word': word, 'count': 1}
    word_dict[word] = word_info
else:
    word_info['count'] += 1
```

### Use the default value of Python's `get`

```python
word_info = word_dict.get(word, {'word': word, 'count': 0})
word_info['count'] += 1
word_dict[word] = word_info
```
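
One detail behind this option: `dict.get(key, default)` returns the default without storing it, which is why the last line assigns `word_info` back into `word_dict`. The sketch below illustrates this and, as an alternative that the original does not use, shows `dict.setdefault`, which stores the default when the key is missing.

```python
word_dict = {}

# get() returns the default but leaves the dictionary unchanged.
info = word_dict.get('style', {'word': 'style', 'count': 0})
info['count'] += 1
style_stored = 'style' in word_dict  # still False: nothing was written back

# setdefault() inserts the default on a miss and returns the stored object,
# so incrementing it updates the dictionary entry directly.
info = word_dict.setdefault('guide', {'word': 'guide', 'count': 0})
info['count'] += 1

print(style_stored, word_dict['guide']['count'])
```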