Merge pull request #3963 from taosdata/feature/crash_gen

Refactoring and enhancing crash_gen tool

39d9e6b4 · Shengliang Guan · GitHub · 846434dd · 87cd1cc0 · 39d9e6b4
7 changed files
--- a/tests/pytest/crash_gen.sh
+++ b/tests/pytest/crash_gen.sh
@@ -54,6 +54,7 @@ export PYTHONPATH=$(pwd)/../../src/connector/python/linux/python3:$(pwd)
 export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$LIB_DIR

 # Now we are all set, so let's see if we can find a crash. Note we pass all params
+CRASH_GEN_EXEC=crash_gen_bootstrap.py
 if [[ $1 == '--valgrind' ]]; then
  shift
  export PYTHONMALLOC=malloc
@@ -66,14 +67,14 @@ if [[ $1 == '--valgrind' ]]; then
    --leak-check=yes \
    --suppressions=crash_gen/valgrind_taos.supp \
    $PYTHON_EXEC \
-    ./crash_gen/crash_gen.py $@ > $VALGRIND_OUT 2> $VALGRIND_ERR 
+    $CRASH_GEN_EXEC $@ > $VALGRIND_OUT 2> $VALGRIND_ERR 
 elif [[ $1 == '--helgrind' ]]; then
  shift
  valgrind  \
    --tool=helgrind \
    $PYTHON_EXEC \
-    ./crash_gen/crash_gen.py $@
+    $CRASH_GEN_EXEC $@
 else
-  $PYTHON_EXEC ./crash_gen/crash_gen.py $@
+  $PYTHON_EXEC $CRASH_GEN_EXEC $@
 fi
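
The three branches of the script differ only in what wraps the Python interpreter. A minimal Python sketch of that dispatch, assuming a hypothetical `build_command` helper and `python3` as the interpreter (the script itself uses `$PYTHON_EXEC`). Only the valgrind flags visible in this hunk are reproduced, and the output redirection is left out:

```python
from typing import List

CRASH_GEN_EXEC = "crash_gen_bootstrap.py"  # entry point introduced by this change


def build_command(args: List[str], python_exec: str = "python3") -> List[str]:
    """Return the argv the wrapper would run for the given script arguments.

    Hypothetical helper for illustration only; it is not part of the PR.
    """
    if args and args[0] == "--valgrind":
        # Flags visible in the diff hunk; any flags outside the hunk are omitted.
        prefix = ["valgrind", "--leak-check=yes",
                  "--suppressions=crash_gen/valgrind_taos.supp"]
        rest = args[1:]  # mirrors the `shift` in the script
    elif args and args[0] == "--helgrind":
        prefix = ["valgrind", "--tool=helgrind"]
        rest = args[1:]
    else:
        prefix = []
        rest = args
    return prefix + [python_exec, CRASH_GEN_EXEC] + rest


print(build_command(["--helgrind", "-s", "10"]))
```

Building the command as a list keeps each argument separate, which avoids the word splitting that an unquoted `$@` can cause in the shell version.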

--- a/tests/pytest/crash_gen/README.md
+++ b/tests/pytest/crash_gen/README.md
+<center><h1>User's Guide to the Crash_Gen Tool</h1></center>
+
+# Introduction
+
+To effectively test and debug our TDengine product, we have developed a simple tool that
+exercises various functions of the system in a randomized fashion, rather than following a
+pre-determined scenario, in the hope of exposing as many problems as possible.
+
+# Preparation
+
+To run this tool, please ensure the following preparation work is done first.
+
+1. Fetch a copy of the TDengine source code, and build it successfully in the `build/`
+    directory.
+1. Ensure that the system has Python 3.8 or above properly installed. We use
+    Ubuntu 20.04 LTS as our own development environment, and suggest you also use such
+    an environment if possible.
+
+# Simple Execution
+
+To run the tool with the simplest method, follow the steps below:
+
+1. Open a terminal window, start the `taosd` service in the `build/` directory 
+    (or however you prefer to start the `taosd` service)
+1. Open another terminal window, go into the `tests/pytest/` directory, and
+    run `./crash_gen.sh -p -t 3 -s 10` (adjust the thread count `-t` and step count `-s` as you wish)
+1. Watch the output to the end and see if you get a `SUCCESS` or `FAILURE`
+
+That's it!
+
+# Running Clusters
+
+This tool also makes it easy to test/verify the clustering capabilities of TDengine. You
+can start a cluster quite easily with the following command:
+
+```
+$ cd tests/pytest/
+$ ./crash_gen.sh -e -o 3
+```
+
+The `-e` option above tells the tool to start the service without running any tests, while
+the `-o 3` option tells the tool to start 3 DNodes and join them together in a cluster.
+You can adjust the number of DNodes as needed.
+
+## Behind the Scenes
+
+When the tool runs a cluster, it uses a number of directories, each holding the information
+for a single DNode:
+
+```
+$ ls build/cluster*
+build/cluster_dnode_0:
+cfg  data  log
+
+build/cluster_dnode_1:
+cfg  data  log
+
+build/cluster_dnode_2:
+cfg  data  log
+```
+
+Therefore, when something goes wrong and you want to reset the whole cluster, simply
+erase all the files:
+
+```
+$ rm -rf build/cluster_dnode_*
+```
+
+## Addresses and Ports
+
+The DNodes in the cluster all bind to the `127.0.0.1` IP address (for now anyway), and
+use port 6030 for the first DNode, 6130 for the second one, and so on.
+
+## Testing Against a Cluster
+
+In a separate terminal window, you can invoke the tool in client mode and test against
+a cluster, such as:
+
+```
+$ ./crash_gen.sh -p -t 10 -s 100 -i 3
+```
+
+Here the `-i` option tells the tool to always create tables with 3 replicas, and run 
+all tests against such tables.
+
+# Additional Features
+
+The full list of the tool's features is available through the `-h` option:
+
+```
+$ ./crash_gen.sh -h
+usage: crash_gen_bootstrap.py [-h] [-a] [-b MAX_DBS] [-c CONNECTOR_TYPE] [-d] [-e] [-g IGNORE_ERRORS] [-i MAX_REPLICAS] [-l] [-n] [-o NUM_DNODES] [-p] [-r]
+                              [-s MAX_STEPS] [-t NUM_THREADS] [-v] [-x]
+
+TDengine Auto Crash Generator (PLEASE NOTICE the Prerequisites Below)
+---------------------------------------------------------------------
+1. You build TDengine in the top level ./build directory, as described in offical docs
+2. You run the server there before this script: ./build/bin/taosd -c test/cfg
+
+optional arguments:
+  -h, --help            show this help message and exit
+  -a, --auto-start-service
+                        Automatically start/stop the TDengine service (default: false)
+  -b MAX_DBS, --max-dbs MAX_DBS
+                        Maximum number of DBs to keep, set to disable dropping DB. (default: 0)
+  -c CONNECTOR_TYPE, --connector-type CONNECTOR_TYPE
+                        Connector type to use: native, rest, or mixed (default: 10)
+  -d, --debug           Turn on DEBUG mode for more logging (default: false)
+  -e, --run-tdengine    Run TDengine service in foreground (default: false)
+  -g IGNORE_ERRORS, --ignore-errors IGNORE_ERRORS
+                        Ignore error codes, comma separated, 0x supported (default: None)
+  -i MAX_REPLICAS, --max-replicas MAX_REPLICAS
+                        Maximum number of replicas to use, when testing against clusters. (default: 1)
+  -l, --larger-data     Write larger amount of data during write operations (default: false)
+  -n, --dynamic-db-table-names
+                        Use non-fixed names for dbs/tables, useful for multi-instance executions (default: false)
+  -o NUM_DNODES, --num-dnodes NUM_DNODES
+                        Number of Dnodes to initialize, used with -e option. (default: 1)
+  -p, --per-thread-db-connection
+                        Use a single shared db connection (default: false)
+  -r, --record-ops      Use a pair of always-fsynced fils to record operations performing + performed, for power-off tests (default: false)
+  -s MAX_STEPS, --max-steps MAX_STEPS
+                        Maximum number of steps to run (default: 100)
+  -t NUM_THREADS, --num-threads NUM_THREADS
+                        Number of threads to run (default: 10)
+  -v, --verify-data     Verify data written in a number of places by reading back (default: false)
+  -x, --continue-on-exception
+                        Continue execution after encountering unexpected/disallowed errors/exceptions (default: false)
+```
+
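
The port layout described under "Addresses and Ports" in the README can be sketched as follows. This assumes that "and so on" means a fixed step of 100 per DNode, which the README implies (6030, then 6130) but does not state outright. `dnode_endpoints` is a hypothetical helper, not part of the tool:

```python
from typing import List

BASE_PORT = 6030         # first DNode, per the README
PORT_STEP = 100          # assumption: inferred from 6030 -> 6130
BIND_ADDR = "127.0.0.1"  # all DNodes bind here, per the README


def dnode_endpoints(num_dnodes: int) -> List[str]:
    """Return the host:port endpoint for each DNode index, starting at 0."""
    if num_dnodes < 1:
        raise ValueError("need at least one DNode")
    return [f"{BIND_ADDR}:{BASE_PORT + i * PORT_STEP}" for i in range(num_dnodes)]


print(dnode_endpoints(3))
# prints: ['127.0.0.1:6030', '127.0.0.1:6130', '127.0.0.1:6230']
```

With `-o 3`, these would correspond to the `cluster_dnode_0` through `cluster_dnode_2` directories, if the step assumption holds.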
--- a/tests/pytest/crash_gen/crash_gen.py
+++ b/tests/pytest/crash_gen/crash_gen.py
--- a/tests/pytest/crash_gen/db.py
+++ b/tests/pytest/crash_gen/db.py
+from __future__ import annotations
+
+import sys
+import time
+import threading
+import requests
+from requests.auth import HTTPBasicAuth
+
+import taos
+from util.sql import *
+from util.cases import *
+from util.dnodes import *
+from util.log import *
+
+from .misc import Logging, CrashGenError, Helper, Dice
+import os
+import datetime
+# from .service_manager import TdeInstance
+
+class DbConn:
+    TYPE_NATIVE = "native-c"
+    TYPE_REST =   "rest-api"
+    TYPE_INVALID = "invalid"
+
+    @classmethod
+    def create(cls, connType, dbTarget):
+        if connType == cls.TYPE_NATIVE:
+            return DbConnNative(dbTarget)
+        elif connType == cls.TYPE_REST:
+            return DbConnRest(dbTarget)
+        else:
+            raise RuntimeError(
+                "Unexpected connection type: {}".format(connType))
+
+    @classmethod
+    def createNative(cls, dbTarget) -> DbConn:
+        return cls.create(cls.TYPE_NATIVE, dbTarget)
+
+    @classmethod
+    def createRest(cls, dbTarget) -> DbConn:
+        return cls.create(cls.TYPE_REST, dbTarget)
+
+    def __init__(self, dbTarget):
+        self.isOpen = False
+        self._type = self.TYPE_INVALID
+        self._lastSql = None
+        self._dbTarget = dbTarget
+
+    def __repr__(self):
+        return "[DbConn: type={}, target={}]".format(self._type, self._dbTarget)
+
+    def getLastSql(self):
+        return self._lastSql
+
+    def open(self):
+        if (self.isOpen):
+            raise RuntimeError("Cannot re-open an existing DB connection")
+
+        # below implemented by child classes
+        self.openByType()
+
+        Logging.debug("[DB] data connection opened: {}".format(self))
+        self.isOpen = True
+
+    def close(self):
+        raise RuntimeError("Unexpected execution, should be overridden")
+
+    def queryScalar(self, sql) -> int:
+        return self._queryAny(sql)
+
+    def queryString(self, sql) -> str:
+        return self._queryAny(sql)
+
+    def _queryAny(self, sql):  # return the single value of a one-row, one-column result
+        if (not self.isOpen):
+            raise RuntimeError("Cannot query database until connection is open")
+        nRows = self.query(sql)
+        if nRows != 1:
+            raise taos.error.ProgrammingError(
+                "Unexpected result for query: {}, rows = {}".format(sql, nRows), 
+                (0x991 if nRows==0 else 0x992)
+            )
+        if self.getResultRows() != 1 or self.getResultCols() != 1:
+            raise RuntimeError("Unexpected result set for query: {}".format(sql))
+        return self.getQueryResult()[0][0]
+
+    def use(self, dbName):
+        self.execute("use {}".format(dbName))
+
+    def existsDatabase(self, dbName: str):
+        ''' Check if a certain database exists '''
+        self.query("show databases")
+        dbs = [v[0] for v in self.getQueryResult()] # ref: https://stackoverflow.com/questions/643823/python-list-transformation
+        # ret2 = dbName in dbs
+        # print("dbs = {}, str = {}, ret2={}, type2={}".format(dbs, dbName,ret2, type(dbName)))
+        return dbName in dbs # TODO: super weird type mangling seen, once here
+
+    def hasTables(self):
+        return self.query("show tables") > 0
+
+    def execute(self, sql):
+        ''' Return the number of rows affected'''
+        raise RuntimeError("Unexpected execution, should be overridden")
+
+    def safeExecute(self, sql):
+        '''Safely execute any SQL query, returning True/False upon success/failure'''
+        try:
+            self.execute(sql)
+            return True # ignore num of results, return success
+        except taos.error.ProgrammingError as err:
+            return False # failed, for whatever TAOS reason
+        # Not possible to reach here, non-TAOS exception would have been thrown
+
+    def query(self, sql) -> int: # return num rows returned
+        ''' Return the number of rows returned'''
+        raise RuntimeError("Unexpected execution, should be overridden")
+
+    def openByType(self):
+        raise RuntimeError("Unexpected execution, should be overridden")
+
+    def getQueryResult(self):
+        raise RuntimeError("Unexpected execution, should be overridden")
+
+    def getResultRows(self):
+        raise RuntimeError("Unexpected execution, should be overridden")
+
+    def getResultCols(self):
+        raise RuntimeError("Unexpected execution, should be overridden")
+
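
The base class signals "must override" by raising `RuntimeError` when a method is called. For comparison, here is a minimal sketch of the same template-method shape using the standard `abc` module, which rejects an incomplete subclass at instantiation time instead. The class names are hypothetical stand-ins, not the PR's classes:

```python
from abc import ABC, abstractmethod


class BaseConn(ABC):
    """Hypothetical stand-in for a connection base class."""

    def __init__(self):
        self.is_open = False

    def open(self):
        if self.is_open:
            raise RuntimeError("Cannot re-open an existing DB connection")
        self._open_by_type()  # transport-specific step, supplied by the subclass
        self.is_open = True

    @abstractmethod
    def _open_by_type(self):
        """Subclasses perform the transport-specific open here."""


class NoopConn(BaseConn):
    def _open_by_type(self):
        pass  # nothing to do, similar to a connection that is "always open"


conn = NoopConn()
conn.open()
print(conn.is_open)  # prints: True
```

Calling `BaseConn()` directly raises `TypeError`, so a missing override is caught before any query is attempted.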
+# Sample: curl -u root:taosdata -d "show databases" localhost:6020/rest/sql
+
+
+class DbConnRest(DbConn):
+    REST_PORT_INCREMENT = 11
+
+    def __init__(self, dbTarget: DbTarget):
+        super().__init__(dbTarget)
+        self._type = self.TYPE_REST
+        restPort = dbTarget.port + self.REST_PORT_INCREMENT
+        self._url = "http://{}:{}/rest/sql".format(
+            dbTarget.hostAddr, restPort)
+        self._result = None
+
+    def openByType(self):  # Open connection        
+        pass  # do nothing, always open
+
+    def close(self):
+        if (not self.isOpen):
+            raise RuntimeError("Cannot clean up database until connection is open")
+        # Do nothing for REST
+        Logging.debug("[DB] REST Database connection closed")
+        self.isOpen = False
+
+    def _doSql(self, sql):
+        self._lastSql = sql # remember this, last SQL attempted
+        try:
+            r = requests.post(self._url, 
+                data = sql,
+                auth = HTTPBasicAuth('root', 'taosdata'))         
+        except:
+            print("REST API Failure (TODO: more info here)")
+            raise
+        rj = r.json()
+        # Sanity check for the "Json Result"
+        if ('status' not in rj):
+            raise RuntimeError("No status in REST response")
+
+        if rj['status'] == 'error':  # clearly reported error
+            if ('code' not in rj):  # error without code
+                raise RuntimeError("REST error return without code")
+            errno = rj['code']  # May need to massage this in the future
+            # print("Raising programming error with REST return: {}".format(rj))
+            raise taos.error.ProgrammingError(
+                rj['desc'], errno)  # todo: check existence of 'desc'
+
+        if rj['status'] != 'succ':  # better be this
+            raise RuntimeError(
+                "Unexpected REST return status: {}".format(
+                    rj['status']))
+
+        nRows = rj['rows'] if ('rows' in rj) else 0
+        self._result = rj
+        return nRows
+
+    def execute(self, sql):
+        if (not self.isOpen):
+            raise RuntimeError(
+                "Cannot execute database commands until connection is open")
+        Logging.debug("[SQL-REST] Executing SQL: {}".format(sql))
+        nRows = self._doSql(sql)
+        Logging.debug(
+            "[SQL-REST] Execution Result, nRows = {}, SQL = {}".format(nRows, sql))
+        return nRows
+
+    def query(self, sql):  # return rows affected
+        return self.execute(sql)
+
+    def getQueryResult(self):
+        return self._result['data']
+
+    def getResultRows(self):
+        print(self._result)
+        raise RuntimeError("TBD") # TODO: finish here to support -v under -c rest
+        # return self._tdSql.queryRows
+
+    def getResultCols(self):
+        print(self._result)
+        raise RuntimeError("TBD")
+
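
The status handling in `_doSql` can be exercised without a server by lifting it into a pure function. A minimal sketch, assuming a hypothetical `RestSqlError` in place of `taos.error.ProgrammingError` and hand-written response dictionaries. The field names (`status`, `code`, `desc`, `rows`) are the ones the code above reads:

```python
class RestSqlError(Exception):
    """Hypothetical stand-in for taos.error.ProgrammingError."""

    def __init__(self, msg, errno):
        super().__init__(msg)
        self.msg = msg
        self.errno = errno


def rows_from_response(rj: dict) -> int:
    """Validate a decoded REST response and return its row count."""
    if "status" not in rj:
        raise RuntimeError("No status in REST response")
    if rj["status"] == "error":
        if "code" not in rj:
            raise RuntimeError("REST error return without code")
        # .get() avoids a KeyError when 'desc' is absent
        raise RestSqlError(rj.get("desc", "unknown error"), rj["code"])
    if rj["status"] != "succ":
        raise RuntimeError(
            "Unexpected REST return status: {}".format(rj["status"]))
    return rj.get("rows", 0)


print(rows_from_response({"status": "succ", "rows": 2}))  # prints: 2
```

Keeping the checks separate from the HTTP call means each failure path can be covered by a plain unit test.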
+    # Duplicate code from TDMySQL, TODO: merge all this into DbConnNative
+
+
+class MyTDSql:
+    # Class variables
+    _clsLock = threading.Lock() # class wide locking
+    longestQuery = None # type: str
+    longestQueryTime = 0.0 # seconds
+    lqStartTime = 0.0
+    # lqEndTime = 0.0 # Not needed, as we have the two above already
+
+    def __init__(self, hostAddr, cfgPath):
+        # Make the DB connection
+        self._conn = taos.connect(host=hostAddr, config=cfgPath) 
+        self._cursor = self._conn.cursor()
+
+        self.queryRows = 0
+        self.queryCols = 0
+        self.affectedRows = 0
+
+    # def init(self, cursor, log=True):
+    #     self.cursor = cursor
+        # if (log):
+        #     caller = inspect.getframeinfo(inspect.stack()[1][0])
+        #     self.cursor.log(caller.filename + ".sql")
+
+    def close(self):
+        self._cursor.close()
+        self._conn.close() # TODO: very important, cursor close does NOT close DB connection!
+
+    def _execInternal(self, sql):
+        startTime = time.time() 
+        ret = self._cursor.execute(sql)
+        # print("\nSQL success: {}".format(sql))
+        queryTime =  time.time() - startTime
+        # Record the query time
+        cls = self.__class__
+        if queryTime > (cls.longestQueryTime + 0.01) :
+            with cls._clsLock:
+                cls.longestQuery = sql
+                cls.longestQueryTime = queryTime
+                cls.lqStartTime = startTime
+        return ret
+
+    def query(self, sql):
+        self.sql = sql
+        try:
+            self._execInternal(sql)
+            self.queryResult = self._cursor.fetchall()
+            self.queryRows = len(self.queryResult)
+            self.queryCols = len(self._cursor.description)
+        except Exception as e:
+            # caller = inspect.getframeinfo(inspect.stack()[1][0])
+            # args = (caller.filename, caller.lineno, sql, repr(e))
+            # tdLog.exit("%s(%d) failed: sql:%s, %s" % args)
+            raise
+        return self.queryRows
+
+    def execute(self, sql):
+        self.sql = sql
+        try:
+            self.affectedRows = self._execInternal(sql)
+        except Exception as e:
+            # caller = inspect.getframeinfo(inspect.stack()[1][0])
+            # args = (caller.filename, caller.lineno, sql, repr(e))
+            # tdLog.exit("%s(%d) failed: sql:%s, %s" % args)
+            raise
+        return self.affectedRows
+
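
`_execInternal` compares against `longestQueryTime` before taking the lock, so two threads can both pass the check and the slower query's record may then be overwritten by the faster one. One possible variation, sketched with a hypothetical `QueryStats` class rather than the PR's `MyTDSql` (and without the 0.01 s threshold), repeats the comparison once the lock is held:

```python
import threading


class QueryStats:
    """Hypothetical class-wide tracker for the slowest query seen so far."""

    _lock = threading.Lock()
    longest_query = None
    longest_time = 0.0  # seconds

    @classmethod
    def record(cls, sql: str, elapsed: float) -> None:
        if elapsed <= cls.longest_time:  # cheap unlocked pre-check
            return
        with cls._lock:
            if elapsed > cls.longest_time:  # re-check now that we hold the lock
                cls.longest_query = sql
                cls.longest_time = elapsed


QueryStats.record("select 1", 0.02)
QueryStats.record("select 2", 0.50)
QueryStats.record("select 3", 0.10)
print(QueryStats.longest_query)  # prints: select 2
```

The unlocked pre-check keeps the common case cheap, while the second comparison makes the update itself safe.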
+class DbTarget:
+    def __init__(self, cfgPath, hostAddr, port):
+        self.cfgPath  = cfgPath
+        self.hostAddr = hostAddr
+        self.port     = port
+    
+    def __repr__(self):
+        return "[DbTarget: cfgPath={}, host={}:{}]".format(
+            Helper.getFriendlyPath(self.cfgPath), self.hostAddr, self.port)
+
+    def getEp(self):
+        return "{}:{}".format(self.hostAddr, self.port)
+
+class DbConnNative(DbConn):
+    # Class variables
+    _lock = threading.Lock()
+    # _connInfoDisplayed = False # TODO: find another way to display this
+    totalConnections = 0 # Not private
+
+    def __init__(self, dbTarget):
+        super().__init__(dbTarget)
+        self._type = self.TYPE_NATIVE
+        self._conn = None
+        # self._cursor = None        
+
+    def openByType(self):  # Open connection
+        # global gContainer
+        # tInst = tInst or gContainer.defTdeInstance # set up in ClientManager, type: TdeInstance
+        # cfgPath = self.getBuildPath() + "/test/cfg"
+        # cfgPath  = tInst.getCfgDir()
+        # hostAddr = tInst.getHostAddr()
+
+        cls = self.__class__ # Get the class, to access class variables
+        with cls._lock: # force single threading for opening DB connections. # TODO: whaaat??!!!
+            dbTarget = self._dbTarget
+            # if not cls._connInfoDisplayed:
+            #     cls._connInfoDisplayed = True # updating CLASS variable
+            Logging.debug("Initiating TAOS native connection to {}".format(dbTarget))                    
+            # Make the connection         
+            # self._conn = taos.connect(host=hostAddr, config=cfgPath)  # TODO: make configurable
+            # self._cursor = self._conn.cursor()
+            # Record the count in the class
+            self._tdSql = MyTDSql(dbTarget.hostAddr, dbTarget.cfgPath) # making DB connection
+            cls.totalConnections += 1 
+        
+        self._tdSql.execute('reset query cache')
+        # self._cursor.execute('use db') # do this at the beginning of every
+
+        # Open connection
+        # self._tdSql = MyTDSql()
+        # self._tdSql.init(self._cursor)
+        
+    def close(self):
+        if (not self.isOpen):
+            raise RuntimeError("Cannot clean up database until connection is open")
+        self._tdSql.close()
+        # Decrement the class wide counter
+        cls = self.__class__ # Get the class, to access class variables
+        with cls._lock:
+            cls.totalConnections -= 1
+
+        Logging.debug("[DB] Database connection closed")
+        self.isOpen = False
+
+    def execute(self, sql):
+        if (not self.isOpen):
+            raise RuntimeError("Cannot execute database commands until connection is open")
+        Logging.debug("[SQL] Executing SQL: {}".format(sql))
+        self._lastSql = sql
+        nRows = self._tdSql.execute(sql)
+        Logging.debug(
+            "[SQL] Execution Result, nRows = {}, SQL = {}".format(
+                nRows, sql))
+        return nRows
+
+    def query(self, sql):  # return rows affected
+        if (not self.isOpen):
+            raise RuntimeError(
+                "Cannot query database until connection is open")
+        Logging.debug("[SQL] Executing SQL: {}".format(sql))
+        self._lastSql = sql
+        nRows = self._tdSql.query(sql)
+        Logging.debug(
+            "[SQL] Query Result, nRows = {}, SQL = {}".format(
+                nRows, sql))
+        return nRows
+        # results are in: return self._tdSql.queryResult
+
+    def getQueryResult(self):
+        return self._tdSql.queryResult
+
+    def getResultRows(self):
+        return self._tdSql.queryRows
+
+    def getResultCols(self):
+        return self._tdSql.queryCols
+
+
+class DbManager():
+    ''' This is a wrapper around DbConn(), to make it easier to use. 
+
+        TODO: rename this to DbConnManager
+    '''
+    def __init__(self, cType, dbTarget):
+        # self.tableNumQueue = LinearQueue() # TODO: delete?
+        # self.openDbServerConnection()
+        self._dbConn = DbConn.createNative(dbTarget) if (
+            cType == 'native') else DbConn.createRest(dbTarget)
+        try:
+            self._dbConn.open()  # may throw taos.error.ProgrammingError: disconnected
+        except taos.error.ProgrammingError as err:
+            # print("Error type: {}, msg: {}, value: {}".format(type(err), err.msg, err))
+            if (err.msg == 'client disconnected'):  # cannot open DB connection
+                print(
+                    "Cannot establish DB connection, please re-run script without parameter, and follow the instructions.")
+                sys.exit(2)
+            else:
+                print("Failed to connect to DB, errno = {}, msg: {}"
+                    .format(Helper.convertErrno(err.errno), err.msg))
+                raise
+        except BaseException:
+            print("[=] Unexpected exception")
+            raise
+
+        # Do this after dbConn is in proper shape
+        # Moved to Database()
+        # self._stateMachine = StateMechine(self._dbConn)
+
+    def getDbConn(self):
+        return self._dbConn
+
+    # TODO: not used any more, to delete
+    def pickAndAllocateTable(self):  # pick any table, and "use" it
+        return self.tableNumQueue.pickAndAllocate()
+
+    # TODO: Not used any more, to delete
+    def addTable(self):
+        with self._lock:
+            tIndex = self.tableNumQueue.push()
+        return tIndex
+
+    # Not used any more, to delete
+    def releaseTable(self, i):  # return the table back, so others can use it
+        self.tableNumQueue.release(i)    
+
+    # TODO: not used any more, delete
+    def getTableNameToDelete(self):
+        tblNum = self.tableNumQueue.pop()  # TODO: race condition!
+        if (not tblNum):  # maybe false
+            return False
+
+        return "table_{}".format(tblNum)
+
+    def cleanUp(self):
+        self._dbConn.close()
+
--- a/tests/pytest/crash_gen/misc.py
+++ b/tests/pytest/crash_gen/misc.py
+import threading
+import random
+import logging
+import os
+
+
+class CrashGenError(Exception):
+    def __init__(self, msg=None, errno=None):
+        self.msg = msg
+        self.errno = errno
+
+    def __str__(self):
+        return str(self.msg)
+
+
+class LoggingFilter(logging.Filter):
+    def filter(self, record: logging.LogRecord):
+        if (record.levelno >= logging.INFO):
+            return True  # info or above always log
+
+        # Commenting out below to adjust...
+
+        # if msg.startswith("[TRD]"):
+        #     return False
+        return True
+
+
+class MyLoggingAdapter(logging.LoggerAdapter):
+    def process(self, msg, kwargs):
+        return "[{}] {}".format(threading.get_ident() % 10000, msg), kwargs
+        # return '[%s] %s' % (self.extra['connid'], msg), kwargs
+
+
+class Logging:
+    logger = None
+
+    @classmethod
+    def getLogger(cls):
+        return cls.logger
+
+    @classmethod
+    def clsInit(cls, gConfig): # TODO: refactor away gConfig
+        if cls.logger:
+            return
+        
+        # Logging Stuff
+        # global misc.logger
+        _logger = logging.getLogger('CrashGen')  # real logger
+        _logger.addFilter(LoggingFilter())
+        ch = logging.StreamHandler()
+        _logger.addHandler(ch)
+
+        # Logging adapter, to be used as a logger
+        print("setting logger variable")
+        # global logger
+        cls.logger = MyLoggingAdapter(_logger, [])
+
+        if (gConfig.debug):
+            cls.logger.setLevel(logging.DEBUG)  # default seems to be INFO
+        else:
+            cls.logger.setLevel(logging.INFO)
+
+    @classmethod
+    def info(cls, msg):
+        cls.logger.info(msg)
+
+    @classmethod
+    def debug(cls, msg):
+        cls.logger.debug(msg)
+
+    @classmethod
+    def warning(cls, msg):
+        cls.logger.warning(msg)
+
+    @classmethod
+    def error(cls, msg):
+        cls.logger.error(msg)
+
+class Status:
+    STATUS_STARTING = 1
+    STATUS_RUNNING  = 2
+    STATUS_STOPPING = 3
+    STATUS_STOPPED  = 4
+
+    def __init__(self, status):
+        self.set(status)
+
+    def __repr__(self):
+        return "[Status: v={}]".format(self._status)
+
+    def set(self, status):
+        self._status = status
+
+    def get(self):
+        return self._status
+
+    def isStarting(self):
+        return self._status == Status.STATUS_STARTING
+
+    def isRunning(self):
+        # return self._thread and self._thread.is_alive()
+        return self._status == Status.STATUS_RUNNING
+
+    def isStopping(self):
+        return self._status == Status.STATUS_STOPPING
+
+    def isStopped(self):
+        return self._status == Status.STATUS_STOPPED
+
+    def isStable(self):
+        return self.isRunning() or self.isStopped()
+
+# Deterministic random number generator
+class Dice():
+    seeded = False  # static, uninitialized
+
+    @classmethod
+    def seed(cls, s):  # static
+        if (cls.seeded):
+            raise RuntimeError(
+                "Cannot seed the random generator more than once")
+        cls.verifyRNG()
+        random.seed(s)
+        cls.seeded = True  # TODO: protect against multi-threading
+
+    @classmethod
+    def verifyRNG(cls):  # Verify that the RNG is deterministic
+        random.seed(0)
+        x1 = random.randrange(0, 1000)
+        x2 = random.randrange(0, 1000)
+        x3 = random.randrange(0, 1000)
+        if (x1 != 864 or x2 != 394 or x3 != 776):
+            raise RuntimeError("System RNG is not deterministic")
+
+    @classmethod
+    def throw(cls, stop):  # get 0 to stop-1
+        return cls.throwRange(0, stop)
+
+    @classmethod
+    def throwRange(cls, start, stop):  # up to stop-1
+        if (not cls.seeded):
+            raise RuntimeError("Cannot throw dice before seeding it")
+        return random.randrange(start, stop)
+
+    @classmethod
+    def choice(cls, cList):
+        return random.choice(cList)
+
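
`Dice` relies on Python's `random` module producing the same sequence for the same seed, which is what makes a failing run repeatable. A small sketch of that property, using separate `random.Random` instances and an arbitrary seed (both are assumptions for illustration; the class above seeds the module-level generator):

```python
import random


def rolls(seed: int, count: int, stop: int):
    """Return `count` values in [0, stop) drawn from a generator seeded with `seed`."""
    rng = random.Random(seed)  # private generator, independent of global state
    return [rng.randrange(0, stop) for _ in range(count)]


first = rolls(seed=42, count=5, stop=1000)
second = rolls(seed=42, count=5, stop=1000)
print(first == second)  # prints: True
```

A private generator per component is one way to keep sequences reproducible even when other code also draws random numbers.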
+class Helper:
+    @classmethod
+    def convertErrno(cls, errno):
+        return errno if (errno > 0) else 0x80000000 + errno
+
+    @classmethod
+    def getFriendlyPath(cls, path): # returns .../xxx/yyy
+        ht1 = os.path.split(path)
+        ht2 = os.path.split(ht1[0])
+        return ".../" + ht2[1] + '/' + ht1[1]
+
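
As a quick illustration of the two `Helper` methods, here they are restated as plain functions with sample inputs. The path and the error number are made up for the example; the function bodies follow the methods above:

```python
import os


def convert_errno(errno: int) -> int:
    """Map a non-positive errno into the 0x80000000-based range; leave positives as-is."""
    return errno if errno > 0 else 0x80000000 + errno


def friendly_path(path: str) -> str:
    """Shorten a path to its last two components, prefixed with '.../'."""
    head, leaf = os.path.split(path)
    parent = os.path.split(head)[1]
    return ".../" + parent + "/" + leaf


print(friendly_path("/home/user/TDengine/build/cluster_dnode_0/cfg"))
# prints: .../cluster_dnode_0/cfg
print(hex(convert_errno(-0x7FFFFCFA)))
# prints: 0x306
```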
+
+class Progress:
+    STEP_BOUNDARY = 0
+    BEGIN_THREAD_STEP = 1
+    END_THREAD_STEP   = 2
+    SERVICE_HEART_BEAT= 3
+    tokens = {
+        STEP_BOUNDARY:      '.',
+        BEGIN_THREAD_STEP:  '[',
+        END_THREAD_STEP:    '] ',
+        SERVICE_HEART_BEAT: '.Y.'
+    }
+
+    @classmethod
+    def emit(cls, token):
+        print(cls.tokens[token], end="", flush=True)
--- a/tests/pytest/crash_gen/service_manager.py
+++ b/tests/pytest/crash_gen/service_manager.py
--- a/tests/pytest/crash_gen_bootstrap.py
+++ b/tests/pytest/crash_gen_bootstrap.py
+# -----!/usr/bin/python3.7
+###################################################################
+#           Copyright (c) 2016 by TAOS Technologies, Inc.
+#                     All rights reserved.
+#
+#  This file is proprietary and confidential to TAOS Technologies.
+#  No part of this file may be reproduced, stored, transmitted,
+#  disclosed or used in any form or by any means other than as
+#  expressly provided by the written permission from Jianhui Tao
+#
+###################################################################
+
+import sys
+from crash_gen.crash_gen import MainExec
+
+if __name__ == "__main__":
+    
+    mExec = MainExec()
+    mExec.init()
+    exitCode = mExec.run()
+
+    print("Exiting with code: {}".format(exitCode))
+    sys.exit(exitCode)