| # Custom modules in Python sample |
| |
This sample illustrates how to define a custom module in the Python API,
implemented in pure Python, and how to compile an overall program that
can use it.
| |
This builds on the capabilities of the `custom_module` sample, which
demonstrates C-based extension modules, applying the same basics to
Python. Some features are not yet implemented on the Python side, and
the API is lower level than we would ultimately like. However, as
demonstrated, it can do some non-trivial things.
| |
| ## Sample description |
| |
| To show off some of the capabilities, this sample: |
| |
* Demonstrates how to define a custom Python function which accepts both
  a buffer and a variant list. Within the implementation, the buffer is
  wrapped by a numpy array for use (see the sketches after this list).
* Module state keeps track of whether the detokenizer is at the start of
  the text or of a sentence. Real detokenizers are much more complex and
  would likely involve an opaque module custom type (not yet implemented
  in Python).
* A global in the main program is used by the `@detokenizer.accumtokens`
  function to accumulate fragments.
* The `@detokenizer.jointokens` function formats and emits the text
  corresponding to the accumulated tokens, respecting sentence boundaries
  and previous state.
* A `reset` function is exported which clears the accumulated tokens and
  resets the detokenizer state.
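
Concretely, the buffer handling in the first item is ordinary Python:
the runtime hands the function an object supporting the buffer protocol,
which numpy can view without a copy. A minimal sketch, where a plain
list stands in for the VM variant list and the `VOCAB` table and
`_accumulated` global are illustrative names, not the sample's exact
code:

```python
import numpy as np

VOCAB = {0: "hello", 1: "world", 2: "."}  # stand-in lookup table

_accumulated = []  # global in the main program that accumulates fragments


def accumtokens(token_buffer, options):
    # Wrap the incoming buffer as a typed numpy view (zero copy).
    ids = np.frombuffer(token_buffer, dtype=np.int32)
    # A variant list may mix element types; this sketch ignores it.
    del options
    _accumulated.extend(VOCAB[int(i)] for i in ids)
    return len(_accumulated)
```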
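The remaining items reduce to a small state machine over that global. A
sketch of the module state plus the join and reset behavior, again
illustrative and deliberately far simpler than a real detokenizer:

```python
class DetokenizerState:
    """Module state: are we at the start of the text / of a sentence?"""

    def __init__(self):
        self.at_text_start = True
        self.at_sentence_start = True


def jointokens(state):
    # Format and emit text for the accumulated fragments, respecting
    # sentence boundaries and state left over from previous calls.
    out = []
    for frag in _accumulated:
        if frag == ".":
            out.append(".")
            state.at_sentence_start = True
        else:
            word = frag.capitalize() if state.at_sentence_start else frag
            out.append(word if state.at_text_start else " " + word)
            state.at_text_start = False
            state.at_sentence_start = False
    _accumulated.clear()
    return "".join(out)


def reset(state):
    # Drop accumulated fragments and return to the initial state.
    _accumulated.clear()
    state.at_text_start = True
    state.at_sentence_start = True


# Example: two sentences' worth of tokens round-trip to formatted text.
state = DetokenizerState()
accumtokens(np.array([0, 1, 2, 0], dtype=np.int32).tobytes(), [])
print(jointokens(state))  # Hello world. Hello
```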
| |
| A real text model would be organized differently, but this example should |
| suffice to show that many of these advanced integration concepts are just |
| simple code. |
| |
| A future version of this sample will embed the detokenizer vocabulary as |
| rodata in the main module and use that to initialize the internal lookup |
| table. |