public abstract class Kernel extends Object implements Cloneable
To write a new kernel, a developer extends the Kernel class and overrides the Kernel.run() method.

To execute this kernel, the developer creates a new instance of it and calls Kernel.execute(int globalSize) with a suitable 'global size'. At runtime Aparapi will attempt to convert the Kernel.run() method (and any method called directly or indirectly by Kernel.run()) into OpenCL for execution on GPU devices made available via the OpenCL platform.

Note that Kernel.run() is not called directly. Instead, the Kernel.execute(int globalSize) method will cause the overridden Kernel.run() method to be invoked once for each value in the range 0 to globalSize - 1.

On the first call to Kernel.execute(int _globalSize), Aparapi will determine the EXECUTION_MODE of the kernel. This decision is made dynamically based on two factors:

1. Whether OpenCL is available on the system.
2. Whether the bytecode of the run() method (and every method that can be called directly or indirectly from the run() method) can be converted into OpenCL.

Below is an example Kernel that calculates the square of a set of input values.

    class SquareKernel extends Kernel {
        private int values[];
        private int squares[];

        public SquareKernel(int values[]) {
            this.values = values;
            squares = new int[values.length];
        }

        public void run() {
            int gid = getGlobalID();
            squares[gid] = values[gid] * values[gid];
        }

        public int[] getSquares() {
            return (squares);
        }
    }

To execute this kernel, first create a new instance of it and then call execute(Range _range).

    int[] values = new int[1024];
    // fill values array
    Range range = Range.create(values.length); // create a range 0..1023
    SquareKernel kernel = new SquareKernel(values);
    kernel.execute(range);

When execute(Range) returns, all the executions of Kernel.run() have completed and the results are available in the squares array.

    int[] squares = kernel.getSquares();
    for (int i = 0; i < values.length; i++) {
        System.out.printf("%4d %4d %8d\n", i, values[i], squares[i]);
    }

A different approach to creating kernels that avoids extending Kernel is to write an anonymous inner class:

    final int[] values = new int[1024];
    // fill the values array
    final int[] squares = new int[values.length];
    final Range range = Range.create(values.length);

    Kernel kernel = new Kernel() {
        public void run() {
            int gid = getGlobalID();
            squares[gid] = values[gid] * values[gid];
        }
    };
    kernel.execute(range);
    for (int i = 0; i < values.length; i++) {
        System.out.printf("%4d %4d %8d\n", i, values[i], squares[i]);
    }

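The dispatch contract described above (run() is never called directly; execute(globalSize) invokes it once per global id) can be modelled without any Aparapi dependency. The following is only a plain-Java sketch of that contract: `MiniKernel` and `MiniSquareDemo` are hypothetical names, the parallel stream merely stands in for whatever device or thread pool Aparapi would pick, and nothing here is converted to OpenCL.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

// Hypothetical stand-in for the documented contract; this is NOT Aparapi's Kernel.
abstract class MiniKernel {
    // Each invocation of run() sees its own global id through getGlobalID().
    private final ThreadLocal<Integer> globalId = ThreadLocal.withInitial(() -> 0);

    protected int getGlobalID() {
        return globalId.get();
    }

    public abstract void run();

    // Invokes run() once for every id in 0 .. globalSize - 1, here via a parallel stream.
    public MiniKernel execute(int globalSize) {
        IntStream.range(0, globalSize).parallel().forEach(id -> {
            globalId.set(id);
            run();
        });
        return this;
    }
}

class MiniSquareDemo {
    // Same shape as the anonymous inner class example: one output slot per global id.
    static int[] squaresOf(final int[] values) {
        final int[] squares = new int[values.length];
        new MiniKernel() {
            @Override
            public void run() {
                int gid = getGlobalID();
                squares[gid] = values[gid] * values[gid];
            }
        }.execute(values.length);
        return squares;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(squaresOf(new int[] {1, 2, 3, 4}))); // prints [1, 4, 9, 16]
    }
}
```

Because every invocation writes only to its own index, the order in which ids are processed does not matter, which is the property that lets the same run() body be scheduled on a GPU or a thread pool.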
Modifier and Type | Class and Description |
---|---|
static interface | Kernel.Constant: We can use this Annotation to 'tag' intended constant buffers. |
class | Kernel.Entry |
static class | Kernel.EXECUTION_MODE: Deprecated. It is no longer recommended that EXECUTION_MODEs are used, as a more sophisticated Device preference mechanism is in place; see KernelManager. Though setExecutionMode(EXECUTION_MODE) is still honored, the default EXECUTION_MODE is now Kernel.EXECUTION_MODE.AUTO, which indicates that the KernelManager will determine execution behaviours. The execution mode enum enumerates the possible modes of executing a kernel. One can request a mode of execution using the values below, and query a kernel after it first executes to determine how it executed. Aparapi supports 5 execution modes. Default is GPU. To request that a kernel is executed in a specific mode, call setExecutionMode(EXECUTION_MODE) before executing the kernel: `int[] values = new int[1024]; /* fill values array */ SquareKernel kernel = new SquareKernel(values); kernel.setExecutionMode(Kernel.EXECUTION_MODE.JTP); kernel.execute(values.length);` Alternatively, the mode can be set through a system property when the application is launched: `java -classpath ....;codegen.jar -Dcom.codegen.executionMode=GPU MyApplication`. Generally setting the execution mode is not recommended (it is best to let Aparapi decide automatically) but the option provides a way to compare a kernel's performance under multiple execution modes. |
class | Kernel.KernelState: This class is for internal Kernel state management. |
static interface | Kernel.Local: We can use this Annotation to 'tag' intended local buffers. |
static interface | Kernel.NoCL: Annotation which can be applied to either a getter (with usual java bean naming convention relative to an instance field), or to any method with void return type, which prevents both the method body and any calls to the method being emitted in the generated OpenCL. |
static interface | Kernel.PrivateMemorySpace: We can use this Annotation to 'tag' __private (unshared) array fields. |

Modifier and Type | Field and Description |
---|---|
static String | CONSTANT_SUFFIX: We can use this suffix to 'tag' intended constant buffers. |
static String | LOCAL_SUFFIX: We can use this suffix to 'tag' intended local buffers. |
static String | PRIVATE_SUFFIX: We can use this suffix to 'tag' __private buffers. |

Constructor and Description |
---|
Kernel() |
Modifier and Type | Method and Description |
---|---|
void | addExecutionModes(Kernel.EXECUTION_MODE... platforms): Deprecated. See Kernel.EXECUTION_MODE. Sets the possible fallback path for execution modes. For example, setExecutionFallbackPath(GPU,CPU,JTP) will try to use the GPU; if that fails it will fall back to OpenCL CPU, and finally it will try JTP. |
void | cancelMultiPass(): Invoking this method flags that once the current pass is complete execution should be abandoned. |
void | cleanUpArrays(): Frees the bulk of the resources used by this kernel, by setting array sizes in non-primitive KernelArgs to 1 (0 size is prohibited) and invoking kernel execution on a zero size range. |
Kernel | clone(): When using a Java Thread Pool Aparapi uses clone to copy the initial instance to each thread. |
Kernel | compile(Device _device): Force pre-compilation of the kernel for a given device, without executing it. |
Kernel | compile(String _entrypoint, Device _device): Force pre-compilation of the kernel for a given device, without executing it. |
void | dispose(): Release any resources associated with this Kernel. |
Kernel | execute(int _range): Start execution of _range kernels. |
Kernel | execute(int _range, int _passes): Start execution of _passes iterations over the _range of kernels. |
Kernel | execute(Range _range): Start execution of _range kernels. |
Kernel | execute(Range _range, int _passes): Start execution of _passes iterations of _range kernels. |
Kernel | execute(String _entrypoint, Range _range): Start execution of globalSize kernels for the given entrypoint. |
Kernel | execute(String _entrypoint, Range _range, int _passes): Start execution of globalSize kernels for the given entrypoint. |
void | executeFallbackAlgorithm(Range _range, int _passId): If hasFallbackAlgorithm() has been overridden to return true, this method should be overridden so as to apply a single pass of the kernel's logic to the entire _range. |
Kernel | get(boolean[] array): Enqueue a request to return this buffer from the GPU. |
Kernel | get(boolean[][] array): Enqueue a request to return this buffer from the GPU. |
Kernel | get(boolean[][][] array): Enqueue a request to return this buffer from the GPU. |
Kernel | get(byte[] array): Enqueue a request to return this buffer from the GPU. |
Kernel | get(byte[][] array): Enqueue a request to return this buffer from the GPU. |
Kernel | get(byte[][][] array): Enqueue a request to return this buffer from the GPU. |
Kernel | get(char[] array): Enqueue a request to return this buffer from the GPU. |
Kernel | get(char[][] array): Enqueue a request to return this buffer from the GPU. |
Kernel | get(char[][][] array): Enqueue a request to return this buffer from the GPU. |
Kernel | get(double[] array): Enqueue a request to return this buffer from the GPU. |
Kernel | get(double[][] array): Enqueue a request to return this buffer from the GPU. |
Kernel | get(double[][][] array): Enqueue a request to return this buffer from the GPU. |
Kernel | get(float[] array): Enqueue a request to return this buffer from the GPU. |
Kernel | get(float[][] array): Enqueue a request to return this buffer from the GPU. |
Kernel | get(float[][][] array): Enqueue a request to return this buffer from the GPU. |
Kernel | get(int[] array): Enqueue a request to return this buffer from the GPU. |
Kernel | get(int[][] array): Enqueue a request to return this buffer from the GPU. |
Kernel | get(int[][][] array): Enqueue a request to return this buffer from the GPU. |
Kernel | get(long[] array): Enqueue a request to return this buffer from the GPU. |
Kernel | get(long[][] array): Enqueue a request to return this buffer from the GPU. |
Kernel | get(long[][][] array): Enqueue a request to return this buffer from the GPU. |
double | getAccumulatedExecutionTime(): Determine the total execution time of all previous Kernel.execute(range) calls for all threads that ran this kernel for the device used in the last kernel execution. |
double | getAccumulatedExecutionTimeAllThreads(Device device): Determine the total execution time of all produced profile reports from all threads that executed the current kernel on the specified device. |
double | getAccumulatedExecutionTimeCurrentThread(Device device): Determine the total execution time of all previous kernel executions called from the current thread, calling this method, that executed the current kernel on the specified device. |
int | getCancelState() |
double | getConversionTime(): Determine the time taken to convert bytecode to OpenCL for first Kernel.execute(range) call. |
int | getCurrentPass() |
Kernel.EXECUTION_MODE | getExecutionMode(): Deprecated. See Kernel.EXECUTION_MODE. Return the current execution mode. Before a Kernel executes, this return value will be the execution mode as determined by the setting of the EXECUTION_MODE enumeration. By default, this setting is either GPU if OpenCL is available on the target system, or JTP otherwise. This default setting can be changed by calling setExecutionMode(). After a Kernel executes, the return value will be the mode in which the Kernel actually executed. |
double | getExecutionTime(): Determine the execution time of the previous Kernel.execute(range) called from the last thread that ran and executed on the most recently used device. |
int[] | getKernelCompileWorkGroupSize(Device device): Retrieves the specified work-group size in the compiled kernel for the specified device or intermediate language for the device. |
long | getKernelLocalMemSizeInUse(Device device): Retrieves the amount of local memory used in the specified device by this kernel instance. |
int | getKernelMaxWorkGroupSize(Device device): Retrieves the maximum work-group size allowed for this kernel when running on the specified device. |
long | getKernelMinimumPrivateMemSizeInUsePerWorkItem(Device device): Retrieves the minimum private memory in use per work item for this kernel instance and the specified device. |
int | getKernelPreferredWorkGroupSizeMultiple(Device device): Retrieves the preferred work-group multiple in the specified device for this kernel instance. |
Kernel.KernelState | getKernelState() |
static String | getMappedMethodName(ClassModel.ConstantPool.MethodReferenceEntry _methodReferenceEntry) |
List<ProfileInfo> | getProfileInfo(): Get the profiling information from the last successful call to Kernel.execute(). |
WeakReference<ProfileReport> | getProfileReportCurrentThread(Device device): Retrieves the most recent complete report available for the current thread calling this method for the current kernel instance and executed on the given device. |
WeakReference<ProfileReport> | getProfileReportLastThread(Device device): Retrieves a profile report for the last thread that executed this kernel on the given device. |
Device | getTargetDevice() |
boolean | hasFallbackAlgorithm(): False by default. |
boolean | hasNextExecutionMode(): Deprecated. |
static void | invalidateCaches() |
boolean | isAllowDevice(Device _device) |
boolean | isAutoCleanUpArrays() |
boolean | isExecuting() |
boolean | isExplicit(): For dev purposes (we should remove this for production) determine whether this Kernel uses explicit memory management. |
static boolean | isMappedMethod(ClassModel.ConstantPool.MethodReferenceEntry methodReferenceEntry) |
static boolean | isOpenCLDelegateMethod(ClassModel.ConstantPool.MethodReferenceEntry methodReferenceEntry) |
boolean | isRunningCL() |
Kernel | put(boolean[] array): Tag this array so that it is explicitly enqueued before the kernel is executed. |
Kernel | put(boolean[][] array): Tag this array so that it is explicitly enqueued before the kernel is executed. |
Kernel | put(boolean[][][] array): Tag this array so that it is explicitly enqueued before the kernel is executed. |
Kernel | put(byte[] array): Tag this array so that it is explicitly enqueued before the kernel is executed. |
Kernel | put(byte[][] array): Tag this array so that it is explicitly enqueued before the kernel is executed. |
Kernel | put(byte[][][] array): Tag this array so that it is explicitly enqueued before the kernel is executed. |
Kernel | put(char[] array): Tag this array so that it is explicitly enqueued before the kernel is executed. |
Kernel | put(char[][] array): Tag this array so that it is explicitly enqueued before the kernel is executed. |
Kernel | put(char[][][] array): Tag this array so that it is explicitly enqueued before the kernel is executed. |
Kernel | put(double[] array): Tag this array so that it is explicitly enqueued before the kernel is executed. |
Kernel | put(double[][] array): Tag this array so that it is explicitly enqueued before the kernel is executed. |
Kernel | put(double[][][] array): Tag this array so that it is explicitly enqueued before the kernel is executed. |
Kernel | put(float[] array): Tag this array so that it is explicitly enqueued before the kernel is executed. |
Kernel | put(float[][] array): Tag this array so that it is explicitly enqueued before the kernel is executed. |
Kernel | put(float[][][] array): Tag this array so that it is explicitly enqueued before the kernel is executed. |
Kernel | put(int[] array): Tag this array so that it is explicitly enqueued before the kernel is executed. |
Kernel | put(int[][] array): Tag this array so that it is explicitly enqueued before the kernel is executed. |
Kernel | put(int[][][] array): Tag this array so that it is explicitly enqueued before the kernel is executed. |
Kernel | put(long[] array): Tag this array so that it is explicitly enqueued before the kernel is executed. |
Kernel | put(long[][] array): Tag this array so that it is explicitly enqueued before the kernel is executed. |
Kernel | put(long[][][] array): Tag this array so that it is explicitly enqueued before the kernel is executed. |
void | registerProfileReportObserver(IProfileReportObserver observer): Registers a new profile report observer to receive profile reports as they're produced. |
abstract void | run(): The entry point of a kernel. |
void | setAutoCleanUpArrays(boolean autoCleanUpArrays): Property which if true enables automatic calling of cleanUpArrays() following each execution. |
void | setExecutionMode(Kernel.EXECUTION_MODE _executionMode): Deprecated. See Kernel.EXECUTION_MODE. Set the execution mode. This should be regarded as a request. The real mode will be determined at runtime based on the availability of OpenCL and the characteristics of the workload. |
void | setExecutionModeWithoutFallback(Kernel.EXECUTION_MODE _executionMode) |
void | setExplicit(boolean _explicit): For dev purposes (we should remove this for production) allow us to define that this Kernel uses explicit memory management. |
void | setFallbackExecutionMode(): Deprecated. |
String | toString() |
void | tryNextExecutionMode(): Deprecated. See Kernel.EXECUTION_MODE. Try the next execution path in the list; if there aren't any more, then give up. |
static boolean | usesAtomic32(ClassModel.ConstantPool.MethodReferenceEntry methodReferenceEntry) |
static boolean | usesAtomic64(ClassModel.ConstantPool.MethodReferenceEntry methodReferenceEntry) |

public static final String LOCAL_SUFFIX
We can use this suffix to 'tag' intended local buffers. So either name the buffer

    int[] buffer_$local$ = new int[1024];

Or use the Annotation form

    @Local int[] buffer = new int[1024];

public static final String CONSTANT_SUFFIX
We can use this suffix to 'tag' intended constant buffers. So either name the buffer

    int[] buffer_$constant$ = new int[1024];

Or use the Annotation form

    @Constant int[] buffer = new int[1024];

public static final String PRIVATE_SUFFIX
We can use this suffix to 'tag' __private buffers. So either name the buffer

    int[] buffer_$private$32 = new int[32];

Or use the Annotation form

    @PrivateMemorySpace(32) int[] buffer = new int[32];

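To make the naming convention concrete, here is a small sketch of how a field name could be mapped to a memory space from its suffix. It is not Aparapi's implementation: `SuffixDemo` and `memorySpaceOf` are hypothetical, the three constant values are assumptions inferred from the example field names above, and treating untagged fields as "global" is likewise an assumption.

```java
// Hypothetical helper, not part of Aparapi: classifies a field name by the
// suffix conventions shown in the field descriptions.
class SuffixDemo {
    // Assumed values, inferred from the example field names in the text.
    static final String LOCAL_SUFFIX = "_$local$";
    static final String CONSTANT_SUFFIX = "_$constant$";
    static final String PRIVATE_SUFFIX = "_$private$";

    static String memorySpaceOf(String fieldName) {
        if (fieldName.endsWith(LOCAL_SUFFIX)) {
            return "local";
        }
        if (fieldName.endsWith(CONSTANT_SUFFIX)) {
            return "constant";
        }
        // The private form carries the array size after the suffix, e.g. buffer_$private$32.
        int at = fieldName.lastIndexOf(PRIVATE_SUFFIX);
        if (at >= 0) {
            String size = fieldName.substring(at + PRIVATE_SUFFIX.length());
            if (!size.isEmpty() && size.chars().allMatch(Character::isDigit)) {
                return "private[" + size + "]";
            }
        }
        return "global"; // assumption: untagged buffers are ordinary global buffers
    }

    public static void main(String[] args) {
        System.out.println(memorySpaceOf("buffer_$local$"));
        System.out.println(memorySpaceOf("buffer_$constant$"));
        System.out.println(memorySpaceOf("buffer_$private$32"));
        System.out.println(memorySpaceOf("buffer"));
    }
}
```

The annotation forms (@Local, @Constant, @PrivateMemorySpace) express the same intent without encoding it in the field name.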
public abstract void run()
The entry point of a kernel. Every kernel must override this method.

public boolean hasFallbackAlgorithm()
False by default. To supply an alternate algorithm, override this method to return true and override executeFallbackAlgorithm(Range, int) with the alternate algorithm.

public void executeFallbackAlgorithm(Range _range, int _passId)
If hasFallbackAlgorithm() has been overridden to return true, this method should be overridden so as to apply a single pass of the kernel's logic to the entire _range.

This is not normally required, as fallback to JavaDevice.THREAD_POOL will implement the algorithm in parallel. However, in the event that thread pool execution may be prohibitively slow, this method might implement a "quick and dirty" approximation to the desired result (for example, a simple box-blur as opposed to a gaussian blur in an image processing application).

public void cancelMultiPass()
Invoking this method flags that once the current pass is complete execution should be abandoned.

Note that in the case of thread-pool/pure java execution we could do better already, using Thread.interrupt() (and/or other means) to abandon execution mid-pass. However at present this is not attempted.

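The cancellation rule for cancelMultiPass() (the flag takes effect only once the current pass is complete) can be illustrated with a plain-Java model. This is a sketch of the documented behaviour only, not Aparapi's scheduler; `MultiPassDemo`, `PassBody` and `demo` are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical model of the documented behaviour; not Aparapi code.
class MultiPassDemo {
    // Callback standing in for one kernel invocation within a pass.
    interface PassBody {
        void run(int globalId, int passId);
    }

    private final AtomicBoolean cancelRequested = new AtomicBoolean(false);

    // Flags that execution should stop once the current pass is complete.
    void cancelMultiPass() {
        cancelRequested.set(true);
    }

    // Runs up to `passes` passes over 0 .. range - 1 and returns the number completed.
    int execute(int range, int passes, PassBody body) {
        cancelRequested.set(false);
        int completed = 0;
        for (int passId = 0; passId < passes; passId++) {
            for (int id = 0; id < range; id++) {
                body.run(id, passId);
            }
            completed++;
            if (cancelRequested.get()) {
                break; // the flag is only consulted between passes
            }
        }
        return completed;
    }

    // Requests cancellation in the middle of pass 2 and reports {passes completed, invocations}.
    static int[] demo() {
        MultiPassDemo kernel = new MultiPassDemo();
        int[] invocations = {0};
        int completed = kernel.execute(4, 10, (id, passId) -> {
            invocations[0]++;
            if (passId == 2 && id == 1) {
                kernel.cancelMultiPass();
            }
        });
        return new int[] {completed, invocations[0]};
    }

    public static void main(String[] args) {
        int[] result = demo();
        System.out.println(result[0] + " passes, " + result[1] + " invocations"); // prints "3 passes, 12 invocations"
    }
}
```

Although cancellation is requested part-way through pass 2, all four ids of that pass still run, which matches the note above that mid-pass abandonment is not attempted.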
public int getCancelState()

public int getCurrentPass()
See also: KernelRunner.getCurrentPass()

public boolean isExecuting()
See also: KernelRunner.isExecuting()

public Kernel clone()
When using a Java Thread Pool Aparapi uses clone to copy the initial instance to each thread. If you choose to override clone() you are responsible for delegating to super.clone().

public Kernel.KernelState getKernelState()

public void registerProfileReportObserver(IProfileReportObserver observer)
Registers a new profile report observer to receive profile reports as they're produced. To cancel the subscription, set the observer to a null value.
Parameters: observer - the observer instance that will receive the profile reports

public WeakReference<ProfileReport> getProfileReportLastThread(Device device)
Retrieves a profile report for the last thread that executed this kernel on the given device.
Parameters: device - the relevant device where the kernel executed
See also: ProfileReport.clone(), getProfileReportCurrentThread(Device), registerProfileReportObserver(IProfileReportObserver), getAccumulatedExecutionTimeAllThreads(Device), #getExecutionTimeLastThread(), #getConversionTimeLastThread()

public WeakReference<ProfileReport> getProfileReportCurrentThread(Device device)
Retrieves the most recent complete report available for the current thread calling this method for the current kernel instance and executed on the given device.
Parameters: device - the relevant device where the kernel executed
See also: ProfileReport.clone(), getProfileReportLastThread(Device), registerProfileReportObserver(IProfileReportObserver), #getExecutionTimeCurrentThread(Device), #getConversionTimeCurrentThread(Device), getAccumulatedExecutionTimeAllThreads(Device)

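Both report getters return a WeakReference, so a caller has to allow for the referent being absent and should take its own copy if the data must outlive the call. The sketch below shows that general Java pattern with a hypothetical `Report` class standing in for ProfileReport; the `executionTimeMs` field and the `snapshot` helper are illustrative assumptions, not Aparapi API.

```java
import java.lang.ref.WeakReference;

class WeakReportDemo {
    // Hypothetical stand-in for ProfileReport; only what the sketch needs.
    static final class Report implements Cloneable {
        final double executionTimeMs;

        Report(double executionTimeMs) {
            this.executionTimeMs = executionTimeMs;
        }

        @Override
        public Report clone() {
            try {
                return (Report) super.clone();
            } catch (CloneNotSupportedException e) {
                throw new AssertionError(e); // cannot happen: Report is Cloneable
            }
        }
    }

    // Returns a private copy of the report, or null when none is available.
    static Report snapshot(WeakReference<Report> ref) {
        if (ref == null) {
            return null;
        }
        Report report = ref.get(); // hold a strong reference while copying
        return report == null ? null : report.clone();
    }

    public static void main(String[] args) {
        Report live = new Report(12.5);
        WeakReference<Report> ref = new WeakReference<>(live);
        System.out.println(snapshot(ref).executionTimeMs); // prints 12.5
        ref.clear();
        System.out.println(snapshot(ref)); // prints null
    }
}
```

Reading the referent once into a local variable avoids the race where a second call to get() returns null after the first returned an object.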
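public double getExecutionTime()
Determine the execution time of the previous Kernel.execute(range) called from the last thread that ran and executed on the most recently used device. Use of getProfileReportLastThread(Device) or registerProfileReportObserver(IProfileReportObserver) is encouraged instead.
See also: getProfileReportCurrentThread(Device), registerProfileReportObserver(IProfileReportObserver), getAccumulatedExecutionTimeAllThreads(Device), getConversionTime(), getAccumulatedExecutionTime()

public double getConversionTime()
Determine the time taken to convert bytecode to OpenCL for first Kernel.execute(range) call. Use of getProfileReportLastThread(Device) or registerProfileReportObserver(IProfileReportObserver) is encouraged instead.
See also: getProfileReportCurrentThread(Device), registerProfileReportObserver(IProfileReportObserver), getAccumulatedExecutionTimeAllThreads(Device), getAccumulatedExecutionTime(), getExecutionTime()

public double getAccumulatedExecutionTimeCurrentThread(Device device)
Determine the total execution time of all previous kernel executions called from the current thread, calling this method, that executed the current kernel on the specified device.
Parameters: device - the device of interest where the kernel executed
See also: getProfileReportCurrentThread(Device), getProfileReportLastThread(Device), registerProfileReportObserver(IProfileReportObserver), getAccumulatedExecutionTimeAllThreads(Device)

public double getAccumulatedExecutionTimeAllThreads(Device device)
Determine the total execution time of all produced profile reports from all threads that executed the current kernel on the specified device.
Parameters: device - the device of interest where the kernel executed
See also: getProfileReportCurrentThread(Device), getProfileReportLastThread(Device), registerProfileReportObserver(IProfileReportObserver), getAccumulatedExecutionTimeCurrentThread(Device)

public double getAccumulatedExecutionTime()
Determine the total execution time of all previous Kernel.execute(range) calls for all threads that ran this kernel for the device used in the last kernel execution. Use of getAccumulatedExecutionTimeAllThreads(Device) is encouraged instead.
See also: #getAccumulatedExecutionTime(Device), #getProfileReport(Device), registerProfileReportObserver(IProfileReportObserver), getExecutionTime(), getConversionTime()
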
public Kernel execute(Range _range)
Start execution of _range kernels.

When kernel.execute(globalSize) is invoked, Aparapi will schedule the execution of globalSize kernels. If the execution mode is GPU then the kernels will execute as OpenCL code on the GPU device. Otherwise, if the mode is JTP, the kernels will execute as a pool of Java threads on the CPU.
Parameters: _range - The number of Kernels that we would like to initiate.

public Kernel execute(int _range)
Start execution of _range kernels.

When kernel.execute(_range) is invoked, Aparapi will schedule the execution of _range kernels. If the execution mode is GPU then the kernels will execute as OpenCL code on the GPU device. Otherwise, if the mode is JTP, the kernels will execute as a pool of Java threads on the CPU.

Since the Range class was added, this method offers backward compatibility and merely defers to execute(Range.create(_range), 1).
Parameters: _range - The number of Kernels that we would like to initiate.

public Kernel execute(Range _range, int _passes)
Start execution of _passes iterations of _range kernels.

When kernel.execute(_range, _passes) is invoked, Aparapi will schedule the execution of _range kernels. If the execution mode is GPU then the kernels will execute as OpenCL code on the GPU device. Otherwise, if the mode is JTP, the kernels will execute as a pool of Java threads on the CPU.
Parameters: _passes - The number of passes to make

public Kernel execute(int _range, int _passes)
Start execution of _passes iterations over the _range of kernels.

When kernel.execute(_range, _passes) is invoked, Aparapi will schedule the execution of _range kernels. If the execution mode is GPU then the kernels will execute as OpenCL code on the GPU device. Otherwise, if the mode is JTP, the kernels will execute as a pool of Java threads on the CPU.

Since the Range class was added, this method offers backward compatibility and merely defers to execute(Range.create(_range), _passes).
Parameters: _range - The number of Kernels that we would like to initiate.

public Kernel execute(String _entrypoint, Range _range)
Start execution of globalSize kernels for the given entrypoint.

When kernel.execute("entrypoint", globalSize) is invoked, Aparapi will schedule the execution of globalSize kernels. If the execution mode is GPU then the kernels will execute as OpenCL code on the GPU device. Otherwise, if the mode is JTP, the kernels will execute as a pool of Java threads on the CPU.
Parameters: _entrypoint - is the name of the method we wish to use as the entrypoint to the kernel

public Kernel execute(String _entrypoint, Range _range, int _passes)
Start execution of globalSize kernels for the given entrypoint.

When kernel.execute("entrypoint", globalSize) is invoked, Aparapi will schedule the execution of globalSize kernels. If the execution mode is GPU then the kernels will execute as OpenCL code on the GPU device. Otherwise, if the mode is JTP, the kernels will execute as a pool of Java threads on the CPU.
Parameters: _entrypoint - is the name of the method we wish to use as the entrypoint to the kernel

public Kernel compile(Device _device) throws CompileFailedException
Force pre-compilation of the kernel for a given device, without executing it.
Parameters: _device - the device for which the kernel is to be compiled
Throws: CompileFailedException - if compilation failed for some reason

public Kernel compile(String _entrypoint, Device _device) throws CompileFailedException
Force pre-compilation of the kernel for a given device, without executing it.
Parameters: _entrypoint - is the name of the method we wish to use as the entrypoint to the kernel; _device - the device for which the kernel is to be compiled
Throws: CompileFailedException - if compilation failed for some reason

public long getKernelMinimumPrivateMemSizeInUsePerWorkItem(Device device) throws QueryFailedException
Retrieves the minimum private memory in use per work item for this kernel instance and the specified device.
Parameters: device - the device where the kernel is intended to run
Throws: QueryFailedException - if the query couldn't complete

public long getKernelLocalMemSizeInUse(Device device) throws QueryFailedException
Retrieves the amount of local memory used in the specified device by this kernel instance.
Parameters: device - the device where the kernel is intended to run
Throws: QueryFailedException - if the query couldn't complete

public int getKernelPreferredWorkGroupSizeMultiple(Device device) throws QueryFailedException
Retrieves the preferred work-group multiple in the specified device for this kernel instance.
Parameters: device - the device where the kernel is intended to run
Throws: QueryFailedException - if the query couldn't complete

public int getKernelMaxWorkGroupSize(Device device) throws QueryFailedException
Retrieves the maximum work-group size allowed for this kernel when running on the specified device.
Parameters: device - the device where the kernel is intended to run
Throws: QueryFailedException - if the query couldn't complete

public int[] getKernelCompileWorkGroupSize(Device device) throws QueryFailedException
Retrieves the specified work-group size in the compiled kernel for the specified device or intermediate language for the device.
Parameters: device - the device where the kernel is intended to run
Throws: QueryFailedException - if the query couldn't complete

public boolean isAutoCleanUpArrays()

public void setAutoCleanUpArrays(boolean autoCleanUpArrays)
Property which if true enables automatic calling of cleanUpArrays() following each execution.

public void cleanUpArrays()
Frees the bulk of the resources used by this kernel, by setting array sizes in non-primitive KernelArgs to 1 (0 size is prohibited) and invoking kernel execution on a zero size range. Unlike dispose(), this does not prohibit further invocations of this kernel, as sundry resources such as OpenCL queues are not freed by this method.

This allows a "dormant" Kernel to remain in existence without undue strain on GPU resources, which may be strongly preferable to disposing a Kernel and recreating another one later, as creation/use of a new Kernel (specifically creation of its associated OpenCL context) is expensive.

Note that where the underlying array field is declared final, for obvious reasons it is not resized to zero.

public void dispose()
Release any resources associated with this Kernel.

When the execution mode is CPU or GPU, Aparapi stores some OpenCL resources in a data structure associated with the kernel instance. The dispose() method must be called to release these resources.

If execute(int _globalSize) is called after dispose() is called the results are undefined.

public boolean isRunningCL()

public final Device getTargetDevice()

public boolean isAllowDevice(Device _device)

@Deprecated public Kernel.EXECUTION_MODE getExecutionMode()
Deprecated. See Kernel.EXECUTION_MODE.
Return the current execution mode. Before a Kernel executes, this return value will be the execution mode as determined by the setting of the EXECUTION_MODE enumeration. By default, this setting is either GPU if OpenCL is available on the target system, or JTP otherwise. This default setting can be changed by calling setExecutionMode().

After a Kernel executes, the return value will be the mode in which the Kernel actually executed.
See also: setExecutionMode(EXECUTION_MODE)

@Deprecated public void setExecutionMode(Kernel.EXECUTION_MODE _executionMode)
Deprecated. See Kernel.EXECUTION_MODE.
Set the execution mode.

This should be regarded as a request. The real mode will be determined at runtime based on the availability of OpenCL and the characteristics of the workload.
Parameters: _executionMode - the requested execution mode.
See also: getExecutionMode()

public void setExecutionModeWithoutFallback(Kernel.EXECUTION_MODE _executionMode)

@Deprecated public void setFallbackExecutionMode()
Deprecated. See Kernel.EXECUTION_MODE.

public static String getMappedMethodName(ClassModel.ConstantPool.MethodReferenceEntry _methodReferenceEntry)

public static boolean isMappedMethod(ClassModel.ConstantPool.MethodReferenceEntry methodReferenceEntry)

public static boolean isOpenCLDelegateMethod(ClassModel.ConstantPool.MethodReferenceEntry methodReferenceEntry)

public static boolean usesAtomic32(ClassModel.ConstantPool.MethodReferenceEntry methodReferenceEntry)

public static boolean usesAtomic64(ClassModel.ConstantPool.MethodReferenceEntry methodReferenceEntry)

public void setExplicit(boolean _explicit)
For dev purposes (we should remove this for production) allow us to define that this Kernel uses explicit memory management.
Parameters: _explicit - (true if we want explicit memory management)

public boolean isExplicit()
For dev purposes (we should remove this for production) determine whether this Kernel uses explicit memory management.

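public Kernel put(long[] array)
Tag this array so that it is explicitly enqueued before the kernel is executed.

public Kernel put(long[][] array)
Tag this array so that it is explicitly enqueued before the kernel is executed.

public Kernel put(long[][][] array)
Tag this array so that it is explicitly enqueued before the kernel is executed.

public Kernel put(double[] array)
Tag this array so that it is explicitly enqueued before the kernel is executed.

public Kernel put(double[][] array)
Tag this array so that it is explicitly enqueued before the kernel is executed.

public Kernel put(double[][][] array)
Tag this array so that it is explicitly enqueued before the kernel is executed.

public Kernel put(float[] array)
Tag this array so that it is explicitly enqueued before the kernel is executed.

public Kernel put(float[][] array)
Tag this array so that it is explicitly enqueued before the kernel is executed.

public Kernel put(float[][][] array)
Tag this array so that it is explicitly enqueued before the kernel is executed.

public Kernel put(int[] array)
Tag this array so that it is explicitly enqueued before the kernel is executed.

public Kernel put(int[][] array)
Tag this array so that it is explicitly enqueued before the kernel is executed.

public Kernel put(int[][][] array)
Tag this array so that it is explicitly enqueued before the kernel is executed.

public Kernel put(byte[] array)
Tag this array so that it is explicitly enqueued before the kernel is executed.

public Kernel put(byte[][] array)
Tag this array so that it is explicitly enqueued before the kernel is executed.

public Kernel put(byte[][][] array)
Tag this array so that it is explicitly enqueued before the kernel is executed.

public Kernel put(char[] array)
Tag this array so that it is explicitly enqueued before the kernel is executed.

public Kernel put(char[][] array)
Tag this array so that it is explicitly enqueued before the kernel is executed.

public Kernel put(char[][][] array)
Tag this array so that it is explicitly enqueued before the kernel is executed.

public Kernel put(boolean[] array)
Tag this array so that it is explicitly enqueued before the kernel is executed.

public Kernel put(boolean[][] array)
Tag this array so that it is explicitly enqueued before the kernel is executed.

public Kernel put(boolean[][][] array)
Tag this array so that it is explicitly enqueued before the kernel is executed.

public Kernel get(long[] array)
Enqueue a request to return this buffer from the GPU.

public Kernel get(long[][] array)
Enqueue a request to return this buffer from the GPU.

public Kernel get(long[][][] array)
Enqueue a request to return this buffer from the GPU.

public Kernel get(double[] array)
Enqueue a request to return this buffer from the GPU.

public Kernel get(double[][] array)
Enqueue a request to return this buffer from the GPU.

public Kernel get(double[][][] array)
Enqueue a request to return this buffer from the GPU.

public Kernel get(float[] array)
Enqueue a request to return this buffer from the GPU.

public Kernel get(float[][] array)
Enqueue a request to return this buffer from the GPU.

public Kernel get(float[][][] array)
Enqueue a request to return this buffer from the GPU.

public Kernel get(int[] array)
Enqueue a request to return this buffer from the GPU.

public Kernel get(int[][] array)
Enqueue a request to return this buffer from the GPU.

public Kernel get(int[][][] array)
Enqueue a request to return this buffer from the GPU.

public Kernel get(byte[] array)
Enqueue a request to return this buffer from the GPU.

public Kernel get(byte[][] array)
Enqueue a request to return this buffer from the GPU.

public Kernel get(byte[][][] array)
Enqueue a request to return this buffer from the GPU.

public Kernel get(char[] array)
Enqueue a request to return this buffer from the GPU.

public Kernel get(char[][] array)
Enqueue a request to return this buffer from the GPU.

public Kernel get(char[][][] array)
Enqueue a request to return this buffer from the GPU.

public Kernel get(boolean[] array)
Enqueue a request to return this buffer from the GPU.

public Kernel get(boolean[][] array)
Enqueue a request to return this buffer from the GPU.

public Kernel get(boolean[][][] array)
Enqueue a request to return this buffer from the GPU.

public List<ProfileInfo> getProfileInfo()
Get the profiling information from the last successful call to Kernel.execute().
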
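The put and get overloads only make sense together with explicit memory management (see setExplicit(boolean)). The following plain-Java model illustrates the idea under a stated assumption: in explicit mode, data moves between host and device only when put or get is called. It is not Aparapi code; `ExplicitBufferDemo` is hypothetical, and it copies eagerly in put, whereas the real put is described as tagging the array for transfer before the next execution.

```java
import java.util.Arrays;

// Hypothetical model of explicit buffer management; not Aparapi code.
class ExplicitBufferDemo {
    private int[] deviceCopy = new int[0]; // stands in for the device-side buffer

    // Copies host data to the "device". (The copy is eager here for simplicity.)
    ExplicitBufferDemo put(int[] host) {
        deviceCopy = Arrays.copyOf(host, host.length);
        return this;
    }

    // Doubles each element of the device copy; the host array is not touched.
    ExplicitBufferDemo execute(int range) {
        for (int id = 0; id < range; id++) {
            deviceCopy[id] *= 2;
        }
        return this;
    }

    // Copies the device data back into the host array.
    ExplicitBufferDemo get(int[] host) {
        System.arraycopy(deviceCopy, 0, host, 0, Math.min(host.length, deviceCopy.length));
        return this;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3};
        ExplicitBufferDemo kernel = new ExplicitBufferDemo();
        kernel.put(data).execute(data.length).execute(data.length);
        System.out.println(Arrays.toString(data)); // prints [1, 2, 3]: nothing copied back yet
        kernel.get(data);
        System.out.println(Arrays.toString(data)); // prints [4, 8, 12]
    }
}
```

The point of the pattern is that two consecutive executions share the device copy without a round trip to the host in between; the host only sees results after an explicit get.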
@Deprecated public void addExecutionModes(Kernel.EXECUTION_MODE... platforms)
Deprecated. See Kernel.EXECUTION_MODE.
Sets the possible fallback path for execution modes. For example, setExecutionFallbackPath(GPU,CPU,JTP) will try to use the GPU; if that fails it will fall back to OpenCL CPU, and finally it will try JTP.

@Deprecated public boolean hasNextExecutionMode()
Deprecated. See Kernel.EXECUTION_MODE.

@Deprecated public void tryNextExecutionMode()
Deprecated. See Kernel.EXECUTION_MODE.
Try the next execution path in the list; if there aren't any more, then give up.

public static void invalidateCaches()

Copyright © 2021 Syncleus. All rights reserved.