Abstract

Video encryption protects multimedia data transmitted over insecure networks. This paper introduces a hybrid key-generation framework combining Fourier–Riesz features with an adapted deep neural model to produce dynamic, frame-dependent keys. A four-channel representation integrating spectral magnitude, spectral phase, directional amplitude, and orientation ensures key decorrelation. Experiments on standard video datasets showed entropy values between 7.96 and 7.99 bits, a strong avalanche effect with an average Hamming distance of 129.62 bits, near-zero inter-frame and inter-channel correlations, and preserved visual quality with a PSNR of 42 dB. Extensive security analysis confirmed the overall robustness of the scheme.

Introduction

The rapid growth of video-based applications has made multimedia security a critical challenge, as traditional encryption algorithms such as AES, DES, and RSA remain computationally costly for large-scale video data [1], [2]. Faster alternatives, including selective and chaos-based encryption, improve efficiency but often suffer from reduced robustness and exploitable vulnerabilities [3], [4]. To overcome these limitations, adaptive deep learning–based key generation has emerged as an effective solution by exploiting content-dependent characteristics [5].

This work proposes a hybrid video encryption framework that integrates Fourier and Riesz transforms within a deep neural architecture to capture both spectral and directional information [6]. A spectro-directional tensor is processed by an orthogonally constrained network, ensuring key stability, decorrelation, and numerical robustness [7]. Hybrid activation, adaptive training, and Jacobian control enable high entropy, strong avalanche effects, and inter-frame independence. Extensive experiments confirm the effectiveness of the proposed framework in achieving secure, robust, and high-quality video encryption.

Literature Review

Recent studies show that deep learning is increasingly used for secret key generation from biometric data. Symmetric keys have been generated from fingerprint images using a VGG-16 network [8], while multimodal biometric fusion combining face and finger-vein features with FaceNet, VGG19, and Siamese architectures has been used to derive stable keys [10]. Post-quantum-compatible keys based on facial CNNs and code-based extractors were proposed in [9], and high-entropy fingerprint-based keys using CNNs with Particle Swarm Optimization were introduced in [11]. In parallel, encryption keys derived from trinion Fourier transforms driven by chaotic systems were presented in [12], without deep learning or temporal adaptation. Unlike these static approaches, the present work targets dynamic video sequences using a temporally adaptive Fourier–Riesz deep model for content-dependent key generation.

Materials and Methods

Video Datasets

For experimental evaluation, the Akiyo sequence from the Xiph.org Video Test Media Repository, a standard YUV video collection, was used as a reference, comprising 300 frames [13]. Each frame, representing both static and dynamic scenes, was extracted and resized to 128 × 128 pixels before feature extraction.

Hardware and Software Environment

Experiments were conducted on a Windows 10 platform using Python 3.10 with TensorFlow/Keras. Training was performed on a system equipped with an 8-core CPU, 16 GB RAM, and an NVIDIA GPU with 8 GB VRAM, enabling efficient tensor processing and accelerated optimization.

Feature Extraction

In the proposed pipeline, four complementary features are extracted from each video frame in order to build a compact yet expressive spectro-directional representation. More precisely, we derive a spectral magnitude map $M_t$, a spectral phase component $\Phi_t$, a Riesz-based directional response $R_t$, and a temporal variation map $\Theta_t$. Together these features form a compact structure that enables the generation of content-dependent keys.
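To make the construction concrete, the sketch below builds one such four-channel tensor with NumPy. The per-channel min–max normalization, the log compression of the magnitude, and the choice of the fourth channel as the local Riesz orientation are illustrative assumptions rather than the paper's exact recipe; the Riesz pair is computed through its standard frequency-domain multiplier.

```python
import numpy as np

def spectro_directional_tensor(frame: np.ndarray) -> np.ndarray:
    """Four-channel spectro-directional representation of one grayscale
    frame: spectral magnitude, spectral phase, Riesz directional
    amplitude, and Riesz orientation (illustrative sketch)."""
    f = frame.astype(np.float64)
    F = np.fft.fft2(f)
    mag = np.abs(np.fft.fftshift(F))        # spectral magnitude map
    phase = np.angle(np.fft.fftshift(F))    # spectral phase component

    # First-order Riesz transform, applied in the frequency domain.
    M, N = f.shape
    u = np.fft.fftfreq(M)[:, None]
    v = np.fft.fftfreq(N)[None, :]
    rho = np.sqrt(u**2 + v**2)
    rho[0, 0] = 1.0                          # avoid division by zero at DC
    r1 = np.real(np.fft.ifft2(-1j * (u / rho) * F))
    r2 = np.real(np.fft.ifft2(-1j * (v / rho) * F))
    amp = np.sqrt(r1**2 + r2**2)             # directional amplitude
    theta = np.arctan2(r2, r1)               # local orientation

    # Stack the channels and normalize each one to [0, 1].
    chans = [np.log1p(mag), phase, amp, theta]
    chans = [(c - c.min()) / (c.max() - c.min() + 1e-12) for c in chans]
    return np.stack(chans, axis=-1)          # shape (M, N, 4)
```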

Network Architecture

Adapted Dense Layer

We define a tensor-dependent orthogonal projection based on $\bar{\psi}_t$:

$$y = \Omega(\bar{\psi}_t)\,x + b, \qquad \Omega(\bar{\psi}_t) \in O(n)$$

where $\Omega(\bar{\psi}_t)$ is an orthogonal matrix dependent on the tensor $\bar{\psi}_t$, belonging to the set $O(n)$ of orthogonal matrices of dimension $n$, and $b$ is a bias term.

Here, the matrix $\Omega(\bar{\psi}_t)$ is regularized to remain orthogonal:

$$\Omega(\bar{\psi}_t)^T \Omega(\bar{\psi}_t) = I$$
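A minimal Keras sketch of such a layer is shown below. How $\Omega$ is conditioned on $\bar{\psi}_t$ is not detailed in the text, so this version keeps a single square weight matrix and exposes the QR reprojection used later during training; the class name and interface are illustrative.

```python
import tensorflow as tf

class OrthogonalDense(tf.keras.layers.Layer):
    """Dense layer y = Omega x + b with a square weight matrix kept
    (approximately) orthogonal; illustrative sketch."""
    def __init__(self, units):
        super().__init__()
        self.units = units

    def build(self, input_shape):
        # Square matrix so that Omega can lie in O(n).
        self.omega = self.add_weight(
            shape=(self.units, self.units),
            initializer=tf.keras.initializers.Orthogonal(),
            name="omega")
        self.b = self.add_weight(
            shape=(self.units,), initializer="zeros", name="b")

    def call(self, x):
        return tf.matmul(x, self.omega) + self.b

    def reproject(self):
        # QR factorization; keeping the Q factor restores
        # Omega^T Omega = I after a gradient update.
        q, _ = tf.linalg.qr(self.omega)
        self.omega.assign(q)
```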

Activation Function

The proposed activation function is not fixed but depends on the spectro-directional features:

$$\varphi_{\mathrm{Hybrid}}(x) = \alpha(\bar{\psi}_t)\,\mathrm{ReLU}(x) + \big(1 - \alpha(\bar{\psi}_t)\big)\,\sigma(x)$$

where $\mathrm{ReLU}(x)$ denotes the rectified linear unit, $\sigma(x)$ denotes the sigmoid function, and $\alpha(\bar{\psi}_t) \in (0,1)$ is a coefficient dynamically computed from $\bar{\psi}_t$.
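The sketch below implements this mixing. The reduction used to obtain $\alpha$ from $\bar{\psi}_t$ is not specified in the text, so the sigmoid-of-mean-energy mapping here is an assumption.

```python
import tensorflow as tf

def alpha_from_tensor(psi_bar):
    # Illustrative choice: squash a mean-energy statistic into (0, 1).
    return tf.sigmoid(tf.reduce_mean(tf.square(psi_bar)))

def hybrid_activation(x, psi_bar):
    """phi_Hybrid(x) = alpha * ReLU(x) + (1 - alpha) * sigmoid(x)."""
    alpha = alpha_from_tensor(psi_bar)
    return alpha * tf.nn.relu(x) + (1.0 - alpha) * tf.sigmoid(x)
```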

Jacobian

The input-output Jacobian, as mentioned in [14], factorizes layer by layer:

$$J_t = \frac{\partial K_t}{\partial \bar{\psi}_t} = \prod_{l=1}^{L} \Big( \Omega_l(\bar{\psi}_t)\,\mathrm{Diag}\big(\varphi'_{\mathrm{Hybrid}}(z_l)\big) \Big)$$

where $K_t$ denotes the encryption key at frame $t$, $\bar{\psi}_t$ is the normalized spectro-directional tensor, $\Omega_l(\bar{\psi}_t)$ is the weight matrix of layer $l$, $\varphi'_{\mathrm{Hybrid}}(z_l)$ is the derivative of the hybrid activation function, and $z_l$ is the pre-activation at layer $l$:

$$z_l = \Omega_l(\bar{\psi}_t)\,h_{l-1} + b_l$$

where $\Omega_l(\bar{\psi}_t)$ is an adaptive orthogonal matrix dependent on the tensor $\bar{\psi}_t$, $h_{l-1}$ represents the output of the previous layer, and $b_l$ is the bias vector of layer $l$.

Owing to the bounded nature of the hybrid activation derivative, we impose:

$$0 < S_l \le \varphi'_{\mathrm{Hybrid}}(z_l) \le \bar{S}_l \le 1$$

ensuring that the Frobenius norm $\|J_t\|_F$ remains controlled. The lower bound $S_l$ prevents a degenerate mapping with vanishing Jacobian norm, while the upper bound $\bar{S}_l \le 1$ avoids gradient explosion.

Training Procedure

Training Hyperparameters and Configuration

The network was trained using gradient descent with an adaptive learning rate initialized at $\eta_0 = 10^{-3}$ and dynamically modulated according to the spectral energy of the input tensor. The number of epochs was set to $E = 50$, as the proposed orthogonally constrained and Jacobian-regularized architecture exhibited rapid convergence. No mini-batching was used, as training was performed sequentially frame by frame. The composite loss weights were empirically fixed as follows: orthogonality penalty $\lambda_{\text{orth}} = 0.1$, inter-frame decorrelation $\lambda_{\text{div}} = 0.5$, and Jacobian margin constraint $\lambda_{\text{jac}} = 0.2$.

Frame-Wise Local and Adaptive Learning

For each frame $t$, the network parameters $\theta_t$ are locally updated as:

$$\theta_{t+1} = \theta_t - \eta_t \nabla_\theta \mathcal{L}_{\text{unsup}}\big(f_t,\, \mathcal{N}_\theta(\bar{\psi}_t)\big)$$

where $\mathcal{N}_\theta$ denotes the deep neural network parameterized by $\theta$; $\bar{\psi}_t$ denotes the normalized input tensor at time $t$; $f_t$ denotes the frame at time $t$; $\mathcal{L}_{\text{unsup}}(\cdot,\cdot)$ denotes the unsupervised loss function; $\nabla_\theta$ denotes the gradient of the loss with respect to the parameters $\theta$; and $\eta_t$ denotes the learning rate at iteration $t$.

The learning rate $\eta_t$ was itself modulated by the spectral energy of the tensor:

$$\eta_t = h\big(E_{\text{spec}}(\bar{\psi}_t)\big)$$

where $\bar{\psi}_t$ denotes the normalized input tensor at time $t$, and $E_{\text{spec}}(\bar{\psi}_t)$ denotes its spectral energy.

As a result, the update becomes self-adaptive: each frame adjusts the learning rate according to its spectral content.
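As an illustration, one monotone choice of the modulation $h$ is shown below; the closed form of $h$ is not given in the text, so both the energy proxy and the decay rule are assumptions.

```python
import tensorflow as tf

def adaptive_learning_rate(psi_bar, eta0=1e-3):
    """eta_t = h(E_spec(psi_bar_t)): smaller steps on high-energy frames
    (illustrative choice of h)."""
    e_spec = tf.reduce_mean(tf.square(psi_bar))  # spectral energy proxy
    return eta0 / (1.0 + e_spec)
```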

Loss Function

The unsupervised objective $\mathcal{L}_{\text{unsup}}$ enforces orthogonality, temporal decorrelation, and minimum sensitivity.

In accordance with (1), the dense layer is parameterized by a square orthogonal matrix conditioned on the spectro-directional tensor, denoted $\Omega(\bar{\psi}_t)$. After each gradient update, we enforce orthogonality by reprojecting the updated matrix onto the orthogonal manifold using a QR factorization and retaining the Q factor, such that:

$$\Omega(\bar{\psi}_t)^T \Omega(\bar{\psi}_t) = I$$

To further stabilize optimization, we add a soft orthogonality penalty:

$$L_{\text{orth}} = \big\|\Omega(\bar{\psi}_t)^T \Omega(\bar{\psi}_t) - I\big\|_F^2$$

To enforce inter-frame independence, we minimize the squared Pearson correlation between successive keys:

$$L_{\text{div}} = \mathrm{corr}(K_t, K_{t+1})^2$$

where $\mathrm{corr}(\cdot,\cdot)$ is the Pearson correlation computed on the vectorized keys of successive frames.

Finally, we enforce a minimum Jacobian norm to promote the avalanche effect:

$$L_{\text{jac}} = \max\big(0,\, \varepsilon - \|J_t\|_F\big)^2$$

where $J_t = \partial K_t / \partial \bar{\psi}_t$ denotes the Jacobian of the key with respect to the normalized input tensor and $\varepsilon$ is the required sensitivity margin.

The overall loss is then given by the weighted sum:

$$\mathcal{L}_{\text{unsup}} = \lambda_{\text{orth}} L_{\text{orth}} + \lambda_{\text{div}} L_{\text{div}} + \lambda_{\text{jac}} L_{\text{jac}}$$
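A TensorFlow sketch of this composite objective is given below. The Jacobian norm $\|J_t\|_F$ is passed in precomputed (it could, for example, be estimated with tf.GradientTape), and the margin eps is an illustrative value.

```python
import tensorflow as tf

def unsup_loss(omega, k_t, k_prev, jac_norm,
               lam_orth=0.1, lam_div=0.5, lam_jac=0.2, eps=1.0):
    """L_unsup = lam_orth*L_orth + lam_div*L_div + lam_jac*L_jac
    (sketch; eps and the jac_norm estimator are assumptions)."""
    # Soft orthogonality penalty: ||Omega^T Omega - I||_F^2.
    eye = tf.eye(tf.shape(omega)[0])
    l_orth = tf.reduce_sum(
        tf.square(tf.matmul(omega, omega, transpose_a=True) - eye))
    # Squared Pearson correlation between vectorized successive keys.
    a = tf.reshape(k_t, [-1]) - tf.reduce_mean(k_t)
    b = tf.reshape(k_prev, [-1]) - tf.reduce_mean(k_prev)
    corr = tf.reduce_sum(a * b) / (tf.norm(a) * tf.norm(b) + 1e-12)
    l_div = tf.square(corr)
    # Jacobian margin: penalize ||J_t||_F falling below eps.
    l_jac = tf.square(tf.maximum(0.0, eps - jac_norm))
    return lam_orth * l_orth + lam_div * l_div + lam_jac * l_jac
```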

Key Generation

The normalized tensor $\bar{\psi}_t \in \mathbb{R}^{M \times N \times 4}$ is used as input to the adapted deep neural network $\mathcal{N}_\theta$ described above.

The network takes $\bar{\psi}_t$ as input and outputs an encryption key with three channels, Red, Green, and Blue (RGB):

$$K_t = \mathcal{N}_\theta(\bar{\psi}_t) \in [0,1]^{M \times N \times 3}$$

where $\mathcal{N}_\theta$ denotes the deep neural network parameterized by $\theta$ applied to the normalized tensor $\bar{\psi}_t$, and $[0,1]^{M \times N \times 3}$ denotes the set of $M \times N \times 3$ tensors with entries in $[0,1]$.

The network generates a continuous RGB encryption key that is scaled to an 8-bit integer array; the channel index $c \in \{R,G,B\}$ is therefore made explicit in the following formula:

$$K_t(x,y,c) \leftarrow \lfloor 255\,K_t(x,y,c) \rfloor \in \mathbb{Z}_{256}, \qquad c \in \{R,G,B\}$$

where $K_t(x,y,c)$ represents the continuous encryption key generated by the deep neural network, $c$ denotes the R, G, or B channel, and $\mathbb{Z}_{256}$ denotes the set of integers from 0 to 255.
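In NumPy this quantization is a one-liner; key_continuous below is a random stand-in for the network output $K_t$ (variable names are illustrative).

```python
import numpy as np

# Stand-in for the network output K_t in [0,1]^(M x N x 3).
key_continuous = np.random.rand(128, 128, 3)

# Scale the continuous key to 8-bit integers in Z_256.
key_u8 = np.floor(255.0 * key_continuous).astype(np.uint8)
```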

Video Encryption

The XOR-based encryption is then performed as:

$$f_t^{(c)}(x,y,c) = f_t^{(RGB)}(x,y,c) \oplus K_t(x,y,c), \qquad c \in \{R,G,B\}$$

where $f_t^{(RGB)}(x,y,c)$ represents the original pixel value for channel $c$, and $\oplus$ denotes the bitwise XOR with the key $K_t(x,y,c)$.

Video Decryption

As mentioned in [15] regarding XOR properties, if a pixel f has been encrypted with a key k, the original can be recovered by:

$$\hat{f}_t(x,y,c) = f_t^{(c)}(x,y,c) \oplus K_t(x,y,c)$$

where $f_t^{(c)}(x,y,c)$ denotes the encrypted pixel at time $t$ and channel $c$, $K_t(x,y,c)$ is the key, identical to the one used during encryption, and $\hat{f}_t(x,y,c)$ represents the decrypted pixel [16].
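Because XOR is an involution, one helper covers both directions; a minimal NumPy sketch:

```python
import numpy as np

def xor_cipher(frame_u8: np.ndarray, key_u8: np.ndarray) -> np.ndarray:
    """Per-pixel XOR; encryption and decryption are the same operation
    since f XOR k XOR k = f."""
    return np.bitwise_xor(frame_u8, key_u8)

# Round trip on a random frame/key pair.
frame = np.random.randint(0, 256, (128, 128, 3), dtype=np.uint8)
key = np.random.randint(0, 256, (128, 128, 3), dtype=np.uint8)
restored = xor_cipher(xor_cipher(frame, key), key)
assert np.array_equal(restored, frame)
```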

Algorithm
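The original algorithm box is not reproduced here; the following Python-style pseudocode summarizes the per-frame pipeline described above. Helper names and the network interface (net.local_update) are illustrative, not the paper's API; spectro_directional_tensor and adaptive_lr refer to the sketches given earlier.

```python
import numpy as np

def encrypt_video(frames, net, spectro_directional_tensor, adaptive_lr):
    """Per-frame pipeline sketch: features -> adaptive update -> key -> XOR."""
    encrypted, key_prev = [], None
    for f_t in frames:                                       # f_t: uint8 RGB frame
        psi_bar = spectro_directional_tensor(f_t.mean(axis=-1))  # M x N x 4 tensor
        eta_t = adaptive_lr(psi_bar)                         # eta_t = h(E_spec)
        net.local_update(psi_bar, key_prev, eta_t)           # minimize L_unsup,
                                                             # then QR reprojection
        k_t = net(psi_bar)                                   # key in [0,1]^(M x N x 3)
        key_u8 = np.floor(255.0 * np.asarray(k_t)).astype(np.uint8)
        encrypted.append(np.bitwise_xor(f_t, key_u8))        # XOR encryption
        key_prev = k_t
    return encrypted
```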

Threat Model and Security Properties

Security was evaluated under standard threat models, including Ciphertext-Only Attack, Known-Plaintext Attack, and Chosen-Plaintext Attack, in accordance with Kerckhoffs’ principle. The spectro-directional deep key generator produced frame-wise, content-dependent keys, preventing key reuse and minimizing temporal correlations. Adaptive learning and hybrid activation introduced strong nonlinearity, while Jacobian-constrained training and orthogonality ensured high entropy, avalanche effect, and statistical independence. Consequently, the framework provided robust security despite the use of XOR-based encryption.

Results

This section presents results on key quality and their effect on the security and robustness of 3D data encryption [17], [18].

Analysis and Evaluation of Generated Keys

Key quality and security were assessed using metrics for randomness, uniqueness, and robustness [19].

Key Entropy

Fig. 1 shows the cumulative distribution of key entropy, illustrating the keys’ uniformity.

Fig. 1. Cumulative distribution function of key entropy.

The generated keys exhibited a mean entropy of 7.67 bits per byte, with a 95% confidence interval that remained above weak-randomness thresholds, thereby confirming strong statistical randomness and cryptographic suitability [20].
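For reference, the byte-level Shannon entropy underlying this metric can be computed as in the sketch below (illustrative implementation).

```python
import numpy as np

def byte_entropy(key_u8: np.ndarray) -> float:
    """Shannon entropy in bits per byte of a uint8 key array; the
    theoretical maximum for uniformly random bytes is 8 bits."""
    counts = np.bincount(key_u8.ravel(), minlength=256)
    p = counts / counts.sum()
    p = p[p > 0]                          # drop empty bins before log2
    return float(-(p * np.log2(p)).sum())
```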

Avalanche Effect

Fig. 2 shows the avalanche effect, where small input changes greatly alter the generated key [21].

Fig. 2. Distribution of the avalanche effect (Hamming distances between 256-bit keys).

The mean Hamming distance of 129.62 bits, with a 95% confidence interval from 129.11 to 130.13, confirmed a balanced avalanche effect; a one-sample t-test against the ideal value of 128 bits yielded p < 0.001, and the observed range of 116–143 bits demonstrated strong diffusion and resistance to differential attacks [22], [23].
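The bit-level Hamming distance used in this test can be measured as below; for ideally diffusing 256-bit keys the expected value is 128 bits (sketch, assuming keys stored as 32-byte uint8 arrays).

```python
import numpy as np

def hamming_bits(k1: np.ndarray, k2: np.ndarray) -> int:
    """Number of differing bits between two equal-shape uint8 keys."""
    return int(np.unpackbits(np.bitwise_xor(k1, k2)).sum())

# Example: two random 256-bit (32-byte) keys differ in ~128 bits.
k1 = np.random.randint(0, 256, 32, dtype=np.uint8)
k2 = np.random.randint(0, 256, 32, dtype=np.uint8)
print(hamming_bits(k1, k2))
```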

Inter-Frame and Inter-Channel Independence

The following analysis, illustrated in Fig. 3, evaluates the independence of keys across frames and color channels to ensure high variability and prevent redundancy [24].

Fig. 3. Key independence between successive frames.

Inter-frame correlations averaged 0.002 over 300 frames with a 95% confidence interval from −0.013 to 0.017 and a p-value of 0.77, consistent with [25], confirming strong temporal independence between successive keys [26].

Fig. 4 shows the correlations between the R, G, and B channels of the generated keys, which are near zero, ranging from −0.04 to 0.01, indicating minimal redundancy and strong statistical independence.

Fig. 4. Key independence across color channels (R, G, B).

According to [27], as shown in Table I, the inter-channel correlations were weak, with averages of 0.0129 for RG, −0.045 for RB, and 0.0129 for GB over 20 frames, corresponding to 95% confidence intervals of [−0.0648, 0.0906], [−0.122, 0.032], and [−0.0648, 0.0906], and p-values of 0.73, 0.23, and 0.73, respectively. These results confirm statistical independence and strong, non-redundant key variability [28].

Statistic   Frame     R–G      R–B      G–B
Nf          20        20       20       20
Mean        118.55    0.0129   −0.045   0.0129
Std         74.013    0.166    0.164    0.166
Min         0.00      −0.273   −0.335   −0.273
Q1          59.00     −0.103   −0.167   −0.103
Q2          118.50    0.0175   −0.032   0.017
Q3          178.00    0.126    0.071    0.126
Max         238.00    0.363    0.285    0.363
Table I. Descriptive Statistics of Inter-Channel Correlations (R–G, R–B, G–B) over 20 Frames

Evaluation of Video Encryption and Decryption Performance

Video Encryption and Decryption Results

Original, encrypted, and decrypted frames are visually compared in Fig. 5 to assess encryption fidelity [29].

Fig. 5. Comparison of (a) original, (b) encrypted, and (c) decrypted video frames.

The Akiyo sequence (a) is unreadable after encryption (b) and fully restored after decryption (c), showing the effectiveness of the key generation method [30].

Correlation between Adjacent Pixels

In accordance with [31], the proposed encryption, as depicted in Fig. 6, significantly reduced adjacent-pixel correlation to a negligible level, confirming key effectiveness.

Fig. 6. Correlation between adjacent pixels.

The original videos exhibited a pixel correlation of approximately 0.9, which dropped close to zero after encryption, confirming the effective removal of spatial redundancies.
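The adjacent-pixel correlation reported here can be measured as in the following sketch (illustrative; the paper does not specify its exact sampling scheme).

```python
import numpy as np

def adjacent_correlation(img: np.ndarray, axis: int = 1) -> float:
    """Pearson correlation between horizontally (axis=1) or vertically
    (axis=0) adjacent pixels of a grayscale frame."""
    a = img.astype(np.float64)
    if axis == 1:
        x, y = a[:, :-1].ravel(), a[:, 1:].ravel()
    else:
        x, y = a[:-1, :].ravel(), a[1:, :].ravel()
    return float(np.corrcoef(x, y)[0, 1])
```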

Validation of Key Effectiveness via Entropy and Directional Correlations

Tables II and III present the entropy and correlations before and after encryption to validate the effectiveness of the generated keys.

Frame Entropy (bits) H-Corr (Orig) H-Corr (Enc) V-Corr (Orig) V-Corr (Enc)
1 7.98 0.91 0.02 0.88 0.00
2 7.97 0.92 0.03 0.89 −0.01
3 7.99 0.93 0.01 0.90 0.02
4 7.96 0.90 0.00 0.87 0.01
5 7.98 0.91 −0.01 0.89 −0.02
Table II. Entropy and Horizontal/Vertical Correlations
Frame D-Corr (Orig) D-Corr (Enc)
1 0.87 0.01
2 0.88 0.00
3 0.89 −0.01
4 0.86 0.02
5 0.87 0.01
Table III. Diagonal Correlations

According to Ghouate [30], entropy near 8 bits ensures strong randomness, and our encrypted video reached between 7.96 and 7.99 bits with directional correlations averaging 0.006 over 5 frames with a 95% confidence interval from −0.008 to 0.020, demonstrating effective key generation.

Evaluation of System Robustness against Disturbances

Robustness was tested via PSNR after decrypting videos affected by noise, compression, or data loss.

The generated keys provided robust encryption: as Table IV highlights, decryption under Gaussian noise achieved a PSNR of 33.8 dB over 300 frames with a 95% confidence interval from 33.76 to 33.84 dB, together with 35.7 dB under JPEG compression and 32 dB under data loss, ensuring reliable visual recovery [32].

Type of attack Average PSNR Interpretation
Gaussian noise ~33.8 dB Minimal visual degradation; video remains usable.
JPEG compression (Q = 75%) ~35.7 dB The encryption withstands moderate compression well.
Packet loss ~32 dB Good robustness; video remains intelligible.
Table IV. Evaluation of Encryption Robustness against Different Types of Attacks

Comparison with the Existing Approaches

Comparative Evolution of Frame PSNR

As Fig. 7 reveals, the proposed method achieved a stable PSNR of 42 dB over 300 frames, with a 95% confidence interval of [41.94, 42.06] dB, thus ensuring high visual quality according to [33]. In contrast, Chaotic Maps, Scalable Video Coding (SVC), and Selective Video Encryption (H.264) exhibited lower PSNR values of 34.02 dB, 37.95 dB, and 30.09 dB, respectively, indicating greater visual loss [34]–[37].

Fig. 7. PSNR performance of proposed and selected existing approaches.

Average SSIM Distribution

Fig. 8 illustrates that the proposed method achieved an SSIM of 0.95 over 300 frames with a 95% confidence interval from 0.949 to 0.951, exceeding SVC at 0.92 [36] but slightly below the perfectly stable 1.0 reported in [38].

Fig. 8. Comparative evolution of frame SSIM: Proposed vs. existing methods.

Encrypted Frame Entropy Evolution

Fig. 9 shows that the proposed method achieved approximately 8-bit entropy, comparable to the coding-characteristics scheme at 7.89 bits [39] and block scrambling at 8 bits [38].

Fig. 9. Comparative evolution of encrypted frame entropy for the proposed method, coding characteristics and block scrambling.

Ablation Study on the Contribution of Model Components

The ablation results in Table V show that combining Fourier and Riesz features with orthogonality, Jacobian adaptation, and hybrid ReLU/Sigmoid activation produces cryptographic keys of the highest quality.

Config. Feat. Ortho. Jac. Activation Frame-wise adapt. Ent. (bits) Corr. Ham. (bits)
C1 F No No ReLU No 7.65 0.12 121.4
C2 R No No Sigmoid No 7.62 0.15 119.8
C3 F + R No No Tanh No 7.71 0.07 124.6
C4 F + R Yes No ReLU/Sigmoid Partial 7.79 0.03 127.1
C5 (proposed) F + R Yes Yes ReLU/Sigmoid Yes 7.96 ≈0.00 129.6
Table V. Ablation Study Evaluating the Contribution of Each Component to Key Quality

Security Validation via Neural Discriminator Attack

A neural discriminator trained on 70% of 300 frames over 50 epochs achieved 49.8% ± 1.2% accuracy, showing encrypted frames are statistically indistinguishable from noise and confirming robustness against neural attacks.

Computational Performance Analysis

Training took 2.3 min on GPU and 9.8 min on CPU for 50 epochs, with an average inference time of 6.4 ms (GPU) and 28.7 ms (CPU), demonstrating near real-time processing. The theoretical complexity $O\big(E \times T\,(N^2 \log N + L d^2 + d^3)\big)$ includes $N^2 \log N$ for FFT-based feature extraction, $L d^2$ for forward and backward propagation, and $d^3$ for QR factorization, and remains manageable thanks to GPU parallelization.

Discussion

The proposed method achieved an average key entropy of 7.67 bits, confirming proximity to the theoretical optimum of 8 bits and indicating strong randomness as well as resistance to statistical attacks, as reported in [40]. The avalanche effect produced an average Hamming distance of 129.62 bits with a 95% confidence interval from 129.11 to 130.13 and a p-value below 0.001, thereby satisfying the strict diffusion criteria established in [41]. Adjacent-pixel correlations decreased from values close to 0.9 to values statistically indistinguishable from zero, in accordance with [40], [42]. Ablation analysis showed that configuration C5 offered the best trade-off between entropy maximization, decorrelation efficiency, and nonlinear sensitivity, consistent with [42], [43].

The decrypted sequences achieved a PSNR close to 42 dB and an SSIM value around 0.95, with confidence intervals confirming stability across all frames. These results demonstrate the superiority of the method over selective encryption approaches described in [39] and chaos-based schemes presented in [44], while maintaining robustness against noise, compression, and packet loss, as shown in [40], [43], [44].

Although the deep Fourier–Riesz framework introduced additional computational cost associated with feature extraction and constrained optimization, and requires hardware acceleration for strict real-time deployment, it offers a favorable trade-off between security, reconstruction fidelity, and statistical stability.

The method also demonstrates resilience against model-extraction attacks, as the neural discriminator failed to recover exploitable patterns, and it limits side-channel leakage through entropy-preserving transformations. Remaining limitations concern computational load and sensitivity to training diversity.

Conclusion

This research presented a novel framework for dynamic and adaptive key generation, leveraging Fourier–Riesz features combined with deep learning. The approach produces high-entropy, decorrelated, and robust keys, ensuring strong cryptographic properties for videos. Experimental results demonstrated that deep spectro-directional features effectively capture temporal and spatial variations, providing robust and independent keys for each frame. Future work will focus on optimizing the key generation process, integrating the framework into modern codecs such as High Efficiency Video Coding (H.265/HEVC), evaluating performance on high-resolution video sequences, exploring alternative spectro-directional transformations, and developing adaptive mechanisms to enhance robustness and scalability in dynamic video scenarios.

Conflict of Interest

The authors declare that they do not have any conflict of interest.

References

1. Shahid Z, Chaumont M, Puech W. Fast protection of H.264/AVC by selective encryption of CAVLC and CABAC for I and P frames. IEEE Trans Circ Syst Video Technol. 2011;21(5):565–76.
2. Li S, Chen G, Zheng X. Chaos-based encryption for digital images and videos. Chaos Solitons Fractals. 2004;22(2):341–61.
3. Lian S. Multimedia content encryption techniques: current status and challenges. Signal Process: Image Commun. 2008;23(3):230–47.
4. Liu F, Koenig H. A survey of video encryption algorithms. Comput Secur. 2010;29(1):3–15.
5. Mousavi A, Baraniuk R. Learning to invert: signal recovery via deep convolutional networks. Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2017.
6. Unser M, Van De Ville D. Wavelet steerability and the higher-order Riesz transform. IEEE Trans Image Process. 2010;19(3):636–52.
7. Vorontsov A, Sun X, Burda M, Turner R. Orthogonality constraints in neural networks through Lie algebra parametrization. Proceedings of the AAAI Conference on Artificial Intelligence, 2020.
8. Hashem MI, Kuban KH. Key generation method from fingerprint image based on deep convolutional neural network model. Nexo Revista Científica. 2023;36(6):906–25.
9. Kuznetsov O, Zakharov D, Frontoni E. Deep learning-based biometric cryptographic key generation with post-quantum security. Multimed Tools Appl. 2024;83(19):56909–38.
10. Yirga TG, Yirga HG, Addisu EG. Cryptographic key generation using deep learning with biometric face and finger vein data. Front Artif Intell. 2025;8:1543268.
11. Erkan U, Toktas A, Enginoğlu S, Akbacak E, Thanh DNH. An image encryption scheme based on chaotic logarithmic map and key generation using deep CNN. Multimed Tools Appl. 2022;81:7365–91.
12. Wang X, Shao Z, Li B, Fu B, Shang Y, Liu X. Color image encryption based on discrete trinion Fourier transform and compressive sensing. Multimed Tools Appl. 2024;83(26):67701–22.
13. Xiph.org Video Test Media. YUV video sequences dataset. 2019. [Online]. Available from: https://media.xiph.org/video/derf/. [Accessed: Mar 30, 2026].
14. Jakubovitz D, Giryes R. Improving DNN robustness to adversarial attacks using Jacobian regularization. Tel Aviv University, Tech. Rep.; 2018.
15. Lizama-Pérez LA. XOR chain and perfect secrecy at the dawn of the quantum era. Universidad Técnica Federico Santa María, Tech. Rep.; 2019.
16. Schneier B. Applied Cryptography: Protocols, Algorithms, and Source Code in C. 2nd ed. New York, NY, USA: Wiley; 2015.
17. Menezes A, Van Oorschot P, Vanstone S. Handbook of Applied Cryptography. Boca Raton, FL, USA: CRC Press; 1996.
18. Stallings W. Cryptography and Network Security: Principles and Practice. Pearson; 2017.
19. National Institute of Standards and Technology (NIST). Security Requirements for Cryptographic Modules. FIPS PUB 140-2; 2001.
20. Contreras-Rodríguez L, Madarro-Capó EJ, Legón-Pérez CM, Rojas O, Sosa-Gómez G. Selecting an effective entropy estimator for short sequences of bits and bytes with maximum entropy. Entropy. 2021;23:561.
21. Matsui M. Linear cryptanalysis method for DES cipher. In Advances in Cryptology–EUROCRYPT '93, LNCS 765. Springer, 1994. pp. 386–97.
22. Biham E, Shamir A. Differential cryptanalysis of DES-like cryptosystems. J Cryptol. 1991;4(1):3–72.
23. Daemen J, Rijmen V. The Design of Rijndael: AES—The Advanced Encryption Standard. Springer; 2002.
24. Shannon CE. Communication theory of secrecy systems. Bell Syst Tech J. 1949;28(4):656–715.
25. Wang X, Yu H. How to break MD5 and other hash functions. In Advances in Cryptology–EUROCRYPT 2005. Springer, 2005. pp. 19–35.
26. Li C, Lin D, Lo K. Cryptanalysis of an image encryption scheme based on a compound chaotic sequence. Signal Process: Image Commun. 2017;52:130–9.
27. Wu Y, Noonan JP, Agaian S. NPCR and UACI randomness tests for image encryption. Cyber J: Multidiscip J Sci Technol. 2011;1(2):31–8.
28. Chen G, Mao Y, Chui CK. A symmetric image encryption scheme based on 3D chaotic cat maps. Chaos Solitons Fractals. 2004;21(3):749–61.
29. Liu H, Wang X. Color image encryption using spatial chaotic systems. Signal Process. 2012;92(12):3492–501.
30. Ghouate NE. A high-entropy image encryption scheme using optimized chaotic maps. Sci Rep. 2025;15(1):14784.
31. Alexan W. A secure and efficient image encryption scheme based on a 5D hyperchaotic system. Sci Rep. 2025;15(1):15794.
32. Gao S, Liu J, Iu HHC, Erkan S, Zhou S, Wu R, et al. Development of a video encryption algorithm for critical areas using 2D extended Schaffer function map and neural networks. Signal Process: Image Commun. 2024;117:103227.
33. Kanungo A, Srivastava A, Anklesaria S, Churi P. A systematic review on video encryption algorithms: a future research. J Auton Intell. 2023;6(2):1–12.
34. Salama WM, Aly MH. Chaotic Maps Based Video Encryption: A New Approach. Pharos University/AASTMT; 2020.
35. Elkamchouchi H, Salama WM, Abouelseoud Y. New video encryption schemes based on chaotic maps. IET Image Processing; 2020.
36. Wang H. A multi-level secure video encryption framework integrating scalable video coding with joint source-channel cryptography. Proceedings of the CONF-MPCS Symposium, 2025.
37. Goyal D, Hemrajani N. Novel selective video encryption for H.264 video. Int J Inform Secur Sci. 2014;3(4):5161.
38. Hosny KM, Zaki MA, Lashin NA, Hamza HM. Fast colored video encryption using block scrambling and multi-key generation. Vis Comput. 2023;39(12):6041–72.
39. Cheng S, Wang L, Ao N, Han Q. A selective video encryption scheme based on coding characteristics. Symmetry. 2020;12(3):332.
40. Das S, Jagan L, Singh GK, Kumar S, Rout J, Soni A, et al. Multilayered digital image encryption approach to resist cryptographic attacks for cybersecurity. PeerJ Comput Sci. 2025;11:e3260.
41. Castro JCH, Sierra JM, Seznec A, Izquierdo A, Ribagorda A. The strict avalanche criterion randomness test. Math Comput Simul. 2005;68(1):1–7.
42. Panwar K, Kukreja S, Singh A, Singh KK. Towards deep learning for efficient image encryption. Procedia Comput Sci. 2023;218:644–50.
43. Wang M, Fu X, Yan X, Teng L. A new chaos-based image encryption algorithm based on discrete Fourier transform and improved Joseph traversal. Mathematics. 2024;12(5):638.
44. Xu H, Tong XJ, Zhang M, Wang Z, Peng J. Dynamic video encryption algorithm for H.264/AVC based on a spatiotemporal chaos system. J Opt Soc Am A. 2016;33(6):1166–74.